Comprehensive NWT Comparison Project (calling all technically skilled members)

by Apognophos 223 Replies latest watchtower bible

  • MeanMrMustard
    MeanMrMustard

    In my limited knowledge, it seems to me that MMM and Simon are saying that they protected it stupid. They encrypted each verse separately, but with the same code. Once the code is broken, the entire package unwraps. But all that extra coding and encrypting means the PDF bible is far bigger than it needs to be.

    Now, I am not in IT, so if I got anything of this wrong, feel free to correct me.

    Well... the only reason I was able to get this was because Android apps run on a form of Java, and you can decompile that. That means I can take the program and reverse the code out of it. Normally, Java (.class files) is much easier to decompile. The Android Java gets mangled down so that some of the code is permanently lost. However, it can be decompiled to an assembly-like code. The logic is still there, but it's not very easy to read. It took me a couple of nights to get through it. Once I saw what was going on, it was all downhill from there.

    Each publication will have a different SHA-256 hash because that hash value is derived from the language + pub symbol + year. So even different languages would be encrypted a little differently - but the overall algorithm is the same.
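    A minimal sketch of that idea, for anyone following along. The separator, the field order, and the sample values ("E", "S", "nwt") are assumptions for illustration only; the thread does not give the exact string the app hashes, and `publication_hash` is a made-up name.

```python
import hashlib


def publication_hash(language: str, pub_symbol: str, year: int) -> str:
    """Return a SHA-256 hex digest derived from language + pub symbol + year.

    Assumption: the three fields are joined with underscores. The real
    app may combine them differently.
    """
    material = f"{language}_{pub_symbol}_{year}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


# Hypothetical inputs: same publication and year, two different languages.
english = publication_hash("E", "nwt", 2013)
spanish = publication_hash("S", "nwt", 2013)

# The digests differ even though the algorithm is identical.
print(english != spanish)
```

    Changing any one of the three inputs produces a completely different digest, which is why each language ends up "encrypted a little differently" while the overall algorithm stays the same.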

    The PDF is public. There was nothing encrypted there. I was reading the PDF with some standard PDF libraries. The issue with the PDF was that it was horrible to parse through given the way PDFs are built.

    MMM

  • MeanMrMustard
    MeanMrMustard

    All,

    More to come later... have to get back to my class...

    MMM

  • MeanMrMustard
    MeanMrMustard

    google-diff-match-patch is great and there's lots of support for it. Not sure if word level is necessary - doing a line level diff (i.e. each verse) is probably better for a first pass and then more specific diffs can be done of those later if needed but most times someone will need to read the whole verse and see the change, less so single words in isolation.

    Simon,

    Hmmm, you may be right. I may be confusing what is meant by "line" level and "word" level. I know that google-diff-match-patch is, by default, a character level diff. I am interpreting "line level" to mean that the diff will act upon two entire lines - essentially indicating which lines are different. Since, as you note, each verse sits on one single line, a changed verse is always going to be different on that level. I am interpreting "word" level to mean differences in the groupings of characters separated by whitespace of some sort (words). Like I said, I could have it wrong and what we need is a line level. The good thing is that google-diff-match-patch seems to have - as you state - a lot of support. And moving between the two shouldn't be hard. We could do both and see how it works out.
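    To make the two levels concrete, here is a rough sketch. It uses Python's standard `difflib` as a stand-in, not google-diff-match-patch itself, and the sample sentences are placeholders rather than real verse text. `verse_changed` and `word_diff` are hypothetical helper names.

```python
import difflib


def verse_changed(old: str, new: str) -> bool:
    """Line (verse) level: only report whether the whole verse differs."""
    return old != new


def word_diff(old: str, new: str) -> list[tuple[str, str, str]]:
    """Word level: diff whitespace-separated tokens instead of characters."""
    old_words = old.split()
    new_words = new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # skip unchanged runs of words
        changes.append((tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes


# Placeholder text, not actual verses.
old_text = "the quick brown fox jumps"
new_text = "the quick red fox leaps"

print(verse_changed(old_text, new_text))
for change in word_diff(old_text, new_text):
    print(change)
```

    The verse-level check is enough for a first pass that flags which verses changed; the word-level pass can then be run only on the flagged verses.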

    MMM

  • Apognophos
    Apognophos

    Just a reminder, in case you missed my post, I would like to have a chance to play around with the verses if you have outputted them as text files.

  • Terry
    Terry

    Thanks for the explanations!

    It would be a consummation devoutly to be wished if hyperlinks could be inserted which automatically alert the reader of the RNWT to the corruptions in the form of *footnotes.

  • MeanMrMustard
    MeanMrMustard

    Apognophos,

    Absolutely! My next task, before I do any diff-match-patch stuff, is to dump each verse out to its own text file - one for the old version and one for the new. I can probably do that tonight sometime :)
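    Something along these lines would do the dump. This is only a sketch: the directory layout (one folder per book and chapter), the `_old` / `_new` file names, and the `dump_verse` helper are all assumptions, and reading the verses out of the database is not shown.

```python
from pathlib import Path


def dump_verse(root, book, chapter, verse, old_text, new_text):
    """Write one verse as two text files, old and new, under root/book/chapter.

    Assumption: the file naming scheme here is invented for illustration.
    """
    chapter_dir = Path(root) / book / str(chapter)
    chapter_dir.mkdir(parents=True, exist_ok=True)

    old_path = chapter_dir / f"{verse}_old.txt"
    new_path = chapter_dir / f"{verse}_new.txt"

    # UTF-8 keeps any non-ASCII characters in the verse text intact.
    old_path.write_text(old_text, encoding="utf-8")
    new_path.write_text(new_text, encoding="utf-8")
    return old_path, new_path
```

    Keeping the two versions side by side in the same folder means any ordinary diff tool can be pointed at a pair of files without extra bookkeeping.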

    MMM

  • MeanMrMustard
    MeanMrMustard

    All,

    I just had another thought about why the WT might want to do this - to hide the fact that MEPS is dead. MEPS was the WTB&TS' publishing system. It allowed them to print in multiple languages and distribute the literature world-wide at a time when that sort of thing was difficult. But when MEPS (Multilanguage Electronic Phototypesetting System) became MEPS (Multilanguage Electronic Publishing System), the flagship publishing system was reduced to a simple Unicode text file. But they can make it look like something else, something special. They can make it look like it's not just a text file. It's a MEPS blob of binary data from an advanced printing system - *wink* *wink*... *nudge* *nudge*

    MMM

  • MeanMrMustard
    MeanMrMustard

    Apognophos,

    I have a dumped DB for you. It is 14.1 MB zipped. I exported each verse into one file, one for the 1984 version, and one for the new 2013 version. The files are grouped under directories for each book and chapter. For example, this is what Genesis 1:5 looks like:

    What is the best way to get this to you?

    MMM

  • Apognophos
    Apognophos

    Cool. Maybe you could put it on sendspace and PM me the link?

  • MeanMrMustard
    MeanMrMustard

    Apognophos,

    Link sent.

    All,

    If anyone else wants a copy of the files, let me know via PM and I'll send you the download link.

    MMM
