Comprehensive NWT Comparison Project (calling all technically skilled members)

by Apognophos 223 Replies latest watchtower bible

MeanMrMustard

@Apognophos,

I see. Thanks for the explanation. I did a quick search for Diff(DELETE,"YOU") entries in my change log and came up with this (count by book):

2 Ecclesiastes
45 Ephesians
134 Numbers
136 1 Corinthians
11 Ezra
95 2 Corinthians
3 Nahum
7 Song of Solomon
33 2 Samuel
34 Hebrews
148 Psalms
12 Daniel
61 Job
2 Obadiah
37 Philippians
4 2 John
153 Jeremiah
11 Revelation
43 2 Chronicles
72 1 Samuel
88 Mark
1 3 John
6 Hosea
54 Judges
39 James
19 2 Peter
4 Habakkuk
11 Joel
206 Deuteronomy
7 Haggai
63 Romans
7 Jude
10 Micah
157 Exodus
179 Ezekiel
101 Genesis
4 Zephaniah
210 Matthew
51 1 Thessalonians
21 Malachi
25 2 Thessalonians
17 Proverbs
13 Zechariah
50 Galatians
6 Lamentations
172 Leviticus
2 Jonah
28 1 Kings
17 Nehemiah
142 Isaiah
24 1 John
1 Philemon
51 Colossians
194 John
2 Esther
196 Luke
24 Amos
39 2 Kings
109 Acts
15 1 Chronicles
51 1 Peter
83 Joshua
11 Ruth
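
A tally like the one above can be sketched in a few lines. This is only an illustration: the change-log layout used here, a sequence of (book, operation, text) tuples, is an assumption, and `count_deletions_by_book` is a hypothetical helper, not the actual query that produced the list.

```python
from collections import Counter

def count_deletions_by_book(change_log, word="YOU"):
    """Count DELETE operations on `word`, grouped by Bible book.

    `change_log` is assumed to be an iterable of (book, op, text)
    tuples; the real change log may be organised differently.
    """
    counts = Counter()
    for book, op, text in change_log:
        if op == "DELETE" and text == word:
            counts[book] += 1
    return counts

# Tiny made-up log, only to show the call shape.
sample_log = [
    ("Matthew", "DELETE", "YOU"),
    ("Matthew", "DELETE", "YOU"),
    ("Mark", "INSERT", "you"),
    ("Mark", "DELETE", "YOU"),
    ("Luke", "DELETE", "the"),
]
for book, n in count_deletions_by_book(sample_log).most_common():
    print(n, book)
```

Using `most_common()` also sorts the books by count, which makes the heavy hitters easier to spot than an unsorted listing.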

MMM
MeanMrMustard

@wallsofjericho,

Looks like Titus 2:13 is the only scripture with a change of Jesus Christ from Christ Jesus... ok then. :)

MMM
doneandout

I got your pdf links. :)

This will save me tons of time to read through to see the changes manually.
lrkr

So... it looks like they took any magic or mystery that was left in the text out. I understand the "seed of the woman" is now the offspring. To me seed is very different than offspring. (one is animate and current, the other could be potential and latent) Also eliminating all of the "came to be" removes that beautiful reference to "TO BE" in the text.

Just when you thought it couldn't get any worse than replacing "grace" with "undeserved kindness" they further castrated the poetry.

I'm an atheist, so I don't really find religious meaning, but the poetry and symbolism have been centuries in the making.
Phizzy

I think they have moved the average JW a lot further away from ever getting an appreciation of what the Bible writers were saying.

When I was an active JW I used to pick up on those rather strange expressions and idioms and research what they were about.

Now the bland paraphrase by non-scholars that is the RNWT has removed any term that would prompt further research.

The good translations (good, notice; none are perfect, I guess) do not emasculate the text.
wallsofjericho

Sorry if this has already been asked, but... are there more occurrences of the name Jehovah? If so, where are they inserted?
fastJehu

@ wallsofjericho

I found in the database from MMM these verses:

book chapter verse
Judges 19 18
1 Samuel 6 3
1 Samuel 10 26
1 Samuel 23 14
1 Samuel 23 16
slii

I don't think the .PUB files in the Watchtower Library are in this same format. I've done lately some reverse engineering of the 2006 version, and I now understand a bit about the format of the publication files.

First of all, I have so far seen nothing to indicate that any portion of the files are encrypted. Large parts of them are compressed, though, using a compression algorithm resembling Huffman which I'm starting to understand (I still don't fully understand the construction of some lookup tables used for decompression).

Some pieces of the textual data (mainly titles) are in uncompressed form, yet this is not immediately obvious from inspecting the files. This is because the system uses internally a 16-bit MEPS-specific character set, which I believe is able to represent multiple scripts but predates Unicode. Overall, I get the impression that whoever designed this knew quite well what they were doing, but there's obviously lots of arcane legacy baggage involved. As to why WTL, or at least the 2006 version, still uses MEPS-coded documents internally, I do not know; perhaps they consider it a useful obfuscation to throw at possible reverse engineers (it did make me scratch my head for a while), or maybe it's just a legacy thing that hasn't been enough of a problem to touch in code that could be in maintenance-only mode.

For example, if you look at wte.lib, it contains lots of uncompressed strings (publication names, if I remember correctly). Most (all?) .pub files also contain some uncompressed strings. In English-language files, look for places where every other byte is 08 (hex); most of those will likely be strings. Uncompressed MEPS strings, at least, tend to be stored in a Pascal-string-like style, i.e. the first 16 bits are the length of the string (in bytes, so it's always an even number), and the string is not null-terminated.
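
The two observations above (every other byte being 08, and a 16-bit length prefix) can be turned into a rough scanner. This is a sketch under assumptions, not a description of the real file format: it assumes the length prefix is little-endian like the character values, and both function names are made up for illustration.

```python
import struct

def find_08_runs(buf, min_units=4):
    """Return (offset, unit_count) for runs of 16-bit units whose
    high byte is 0x08. With little-endian storage the 0x08 is the
    second byte of each pair. Spaces (fb61) will break a run, so
    expect one run per word rather than per string."""
    runs = []
    i, n = 0, len(buf)
    while i + 1 < n:
        if buf[i + 1] == 0x08:
            j = i
            while j + 1 < n and buf[j + 1] == 0x08:
                j += 2
            if (j - i) // 2 >= min_units:
                runs.append((i, (j - i) // 2))
            i = j
        else:
            i += 1
    return runs

def read_length_prefixed(buf, offset=0):
    """Read one string stored as <16-bit byte length><16-bit units>.

    Assumes a little-endian length prefix that counts bytes.
    Returns (list_of_code_units, offset_after_string)."""
    (length,) = struct.unpack_from("<H", buf, offset)
    if length % 2:
        raise ValueError("odd byte length; probably not a MEPS string")
    start, end = offset + 2, offset + 2 + length
    if end > len(buf):
        raise ValueError("length prefix runs past end of buffer")
    units = list(struct.unpack_from(f"<{length // 2}H", buf, start))
    return units, end
```

In practice one would use `find_08_runs` to locate candidates in a hex dump and then check whether the two bytes just before a candidate hold a plausible length.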

Some 16-bit values seem to be mapped to some kind of control codes that most probably specify things like italics. Most of the Latin alphabet seems to be in the 08xx range. Being a typesetting-oriented coding system, it also seems to contain codes for ligatures; for example, the "ff" ligature is apparently represented as 0851.

Specifically, the English alphabet is mapped to 16-bit values, which are stored in little-endian format, as follows (numbers in hex):

0800..0819 A-Z; 081a..0833 a-z; 0834..083d numbers, "1234567890" (BTW this is the only character set I know of where 0 comes after 9, not before 1)

0841..0844 ":;.,"; 0845, 0846 left and right single quotation marks; 0847..0848 "?!"; 084b..084e "()/-"; 084f em-dash; 0850 en-dash; fb61 <SPACE>

0851 ff ligature; 0865 é (small e with acute); 08fb hyphenation point (used a lot e.g. in NWT to show proper hyphenation of names).

fb57 and fb58 are often seen around dashes, as in <fb57>--<fb58>. Perhaps they prevent breaking the line between them?
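
The mapping listed above is small enough to write down as a lookup table. The following is a minimal sketch covering only the code points mentioned in this post; everything else (including fb57 and fb58, whose meaning is only guessed at) is printed as a hex placeholder. The names `build_meps_table` and `decode_meps` are hypothetical, and the middle dot used for the hyphenation point is just a stand-in.

```python
import string
import struct

def build_meps_table():
    """Build {16-bit MEPS value: text} from the ranges listed above."""
    table = {}

    def add_run(start, chars):
        for i, ch in enumerate(chars):
            table[start + i] = ch

    add_run(0x0800, string.ascii_uppercase)   # 0800..0819 A-Z
    add_run(0x081A, string.ascii_lowercase)   # 081a..0833 a-z
    add_run(0x0834, "1234567890")             # 0834..083d, 0 comes last
    add_run(0x0841, ":;.,")                   # 0841..0844
    add_run(0x0845, "\u2018\u2019")           # left/right single quotes
    add_run(0x0847, "?!")                     # 0847..0848
    add_run(0x084B, "()/-")                   # 084b..084e
    table[0x084F] = "\u2014"                  # em-dash
    table[0x0850] = "\u2013"                  # en-dash
    table[0x0851] = "ff"                      # ff ligature, expanded
    table[0x0865] = "\u00e9"                  # e with acute
    table[0x08FB] = "\u00b7"                  # hyphenation point (stand-in)
    table[0xFB61] = " "                       # space
    return table

MEPS_TABLE = build_meps_table()

def decode_meps(data):
    """Decode little-endian 16-bit MEPS units; unknown values are
    rendered as <xxxx> so they stay visible for further study."""
    if len(data) % 2:
        raise ValueError("MEPS data must have an even number of bytes")
    return "".join(
        MEPS_TABLE.get(unit, f"<{unit:04x}>")
        for (unit,) in struct.iter_unpack("<H", data)
    )

# "Job 1" encoded by hand from the table above.
print(decode_meps(struct.pack("<5H", 0x0809, 0x0828, 0x081B, 0xFB61, 0x0834)))
```

Leaving unknown values visible rather than dropping them makes it easier to spot the control codes (italics and so on) that still need to be identified.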

I guess the overall format of the .pub files and the compression are a topic for another post.
MeanMrMustard

@slii, Hi there! This looks like your one and only post. I've let this thread slide, and I missed your post. Sorry about that.

I don't think the .PUB files in the Watchtower Library are in this same format. I've done lately some reverse engineering of the 2006 version, and I now understand a bit about the format of the publication files.

Good for you! I don't have the stomach to get into those PUB files. I didn't really think the PUB files were encoded the way the mobile app is encoding its data.

First of all, I have so far seen nothing to indicate that any portion of the files are encrypted. Large parts of them are compressed, though, using a compression algorithm resembling Huffman which I'm starting to understand (I still don't fully understand the construction of some lookup tables used for decompression).

You may be correct here.

Some pieces of the textual data (mainly titles) are in uncompressed form, yet this is not immediately obvious from inspecting the files. This is because the system uses internally a 16-bit MEPS-specific character set, which I believe is able to represent multiple scripts but predates Unicode. Overall, I get the impression that whoever designed this knew quite well what they were doing, but there's obviously lots of arcane legacy baggage involved. As to why WTL, or at least the 2006 version, still uses MEPS-coded documents internally, I do not know; perhaps they consider it a useful obfuscation to throw at possible reverse engineers (it did make me scratch my head for a while), or maybe it's just a legacy thing that hasn't been enough of a problem to touch in code that could be in maintenance-only mode.

Agreed. At this point I think MEPS is dead. Unicode can take its place quite easily and, in fact, would be a lot easier to work with. But I think there are a lot of things about the WT Lib that are arcane (more on that below).

For example, if you look at wte.lib, it contains lots of uncompressed strings (publication names, if I remember correctly). Most (all?) .pub files also contain some uncompressed strings. In English-language files, look for places where every other byte is 08 (hex); most of those will likely be strings. Uncompressed MEPS strings, at least, tend to be stored in a Pascal-string-like style, i.e. the first 16 bits are the length of the string (in bytes, so it's always an even number), and the string is not null-terminated.

Interesting.

Some 16-bit values seem to be mapped to some kind of control codes that most probably specify things like italics. Most of the Latin alphabet seems to be in the 08xx range. Being a typesetting-oriented coding system, it also seems to contain codes for ligatures; for example, the "ff" ligature is apparently represented as 0851.

Specifically, the English alphabet is mapped to 16-bit values, which are stored in little-endian format, as follows (numbers in hex):

0800..0819 A-Z; 081a..0833 a-z; 0834..083d numbers, "1234567890" (BTW this is the only character set I know of where 0 comes after 9, not before 1)

0841..0844 ":;.,"; 0845, 0846 left and right single quotation marks; 0847..0848 "?!"; 084b..084e "()/-"; 084f em-dash; 0850 en-dash; fb61 <SPACE>

0851 ff ligature; 0865 é (small e with acute); 08fb hyphenation point (used a lot e.g. in NWT to show proper hyphenation of names).

fb57 and fb58 are often seen around dashes, as in <fb57>--<fb58>. Perhaps they prevent breaking the line between them?

I guess the overall format of the .pub files and the compression are a topic for another post.

Very interesting! You have definitely gotten into the weeds. Ultimately, however, I think there might be an easier way. To date I have successfully extracted every piece of text from the 2011 and 2012 WTLibs. I did this without looking at the PUB files and without any manual clicking. I planned on mentioning it in a future thread with some statistics calculated from the text.

I thought it would be cool to see if there are any meaningful differences between the 2011 and 2012 versions of the text. That is, we expect some differences: new content for the 2012 version, and probably some new entries in the publication index. But aside from that, I am just wondering if there are any textual changes they snuck in. The NWT project turned out to be difficult because there were so many changes involved, and it was meant to be that way. But if you take, say, the Jan 1 1980 issue of the WT as it appears in the 2011 version and in the 2012 version, you would expect no changes between the two documents.
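
A comparison along those lines could look roughly like this. It is a sketch only: it assumes the text of each edition has already been extracted into a dict keyed by some document identifier (the identifiers and the `compare_editions` name are hypothetical), and it uses Python's standard `difflib` rather than whatever tooling the NWT comparison used.

```python
import difflib

def compare_editions(old_docs, new_docs):
    """Return {doc_id: [unified diff lines]} for documents that exist
    in both editions but whose text differs. Documents found in only
    one edition (e.g. new 2012 content) are ignored."""
    changes = {}
    for doc_id in sorted(old_docs.keys() & new_docs.keys()):
        old, new = old_docs[doc_id], new_docs[doc_id]
        if old == new:
            continue
        diff = difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"2011/{doc_id}",
            tofile=f"2012/{doc_id}",
            lineterm="",
        )
        changes[doc_id] = list(diff)
    return changes

# Made-up documents, only to show the call shape.
docs_2011 = {"w80-0101": "line one\nline two", "w80-0115": "unchanged"}
docs_2012 = {"w80-0101": "line one\nline 2", "w80-0115": "unchanged",
             "w12-0101": "new content"}
for doc_id, diff in compare_editions(docs_2011, docs_2012).items():
    print("\n".join(diff))
```

Restricting the comparison to identifiers present in both editions filters out the expected additions, so anything reported is a change to previously published text.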

Anyhow, the code base is the same between the 2011 and 2012 versions, and probably all the previous versions. They just add new content. I can tell because the 2011 version has a serious memory leak in it, and the same issue is carried over to the 2012 version.

MMM
DS211

Who do I message to get a PDF?

