A couple of years ago, I participated in a thread on this site comparing the old NWT (1984) to the new NWT (2013). I was able to produce an NWT mash-up PDF showing the side-by-side differences between the two versions. Here
The hosting site claims it will keep the file available for download for a long time. Don’t click the Download button, however. Click the file link toward the middle of the page.
The project was originally started as a way to see what kind of new interpretations the WT was trying to slip into the Bible text; however, the results were ambiguous. There were too many minor changes, and any large change in meaning or interpretation was drowned out by a sea of meaningless modifications. Nevertheless, the NWT Diff project did give me an idea for a new project, one that employed the same diff code but would produce more interesting results. The idea: compare the WT Library from one year to the next. That is, see what changes and revisions occur to the content within the WT library from year to year. This would be especially interesting for the content of older publications. After all, why would the WT fuss with the language of a 1980s WT, for example?
Also, for me, writing the programs to tackle this problem was the main driver. It is possible that after all the work is done, no surprising results will be found, perhaps only minor changes. Even then, I would still consider the project a success. The goal, for me, was to solve the problem of getting all the content of the WT library for two different years and producing a change set between the two. If you are the type that likes computer programming, and you are wondering how one might go about getting the content of the entire WT library and doing a compare, then this might be the thread for you. Also, if you are interested in taking this project further, I can share the code. For fun, I did the WT library export in C# and the Diff program in VB.NET. If you want the code, you will need Visual Studio 2010 or later. You will have to get your own copy of the WT library.
I started the project a few days after the NWT mash-up, and a couple of weeks later, I had a working version. However, I switched jobs, and life got very busy, so the project was pushed onto a shelf. I almost completely forgot about it. But the other night, while watching a nice warm fire, with my fat cat sleeping on my lap, I suddenly remembered the work I had done on the project. So I decided to clean up what I had and post it. What follows are some of the results. For the project, I originally used the 2011 and 2012 WT libraries. I think I picked those versions initially because I was going to increment through new versions (2012 vs 2013, etc).
The project was broken down into two parts. PART 1: get all of the content from the WT library into a readable format. Early on, I decided that the ideal would be to export the WT library into a folder structure that mimicked the WT library structure, with plain text files. I would then compare each file against the file with the same name in the other version. PART 2: create a comparison program to traverse the exported WT library file structure and produce files for any differences found. If no differences are found, then produce no output for that content.
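A minimal sketch of the PART 2 traversal, written in Python rather than the VB.NET used for the actual Diff program. It assumes both exports are folders of UTF-8 `.txt` files with matching relative paths; the function names and the use of a unified diff for the report body are illustrative assumptions, not the original program's behavior.

```python
import difflib
import filecmp
from pathlib import Path


def find_changed_files(old_root, new_root):
    """Return relative paths of files that exist in both exports but differ."""
    changed = []
    for old_file in sorted(Path(old_root).rglob("*.txt")):
        rel = old_file.relative_to(old_root)
        new_file = Path(new_root) / rel
        # Compare only when a file with the same name exists in both versions.
        if new_file.is_file() and not filecmp.cmp(old_file, new_file, shallow=False):
            changed.append(rel)
    return changed


def write_reports(old_root, new_root, out_root):
    """Mirror the folder structure under out_root, one report per changed file."""
    changed = find_changed_files(old_root, new_root)
    for rel in changed:
        old_lines = (Path(old_root) / rel).read_text(encoding="utf-8").splitlines(keepends=True)
        new_lines = (Path(new_root) / rel).read_text(encoding="utf-8").splitlines(keepends=True)
        report = Path(out_root) / rel
        # Folders are created only when there is something to report,
        # so unchanged content produces no output at all.
        report.parent.mkdir(parents=True, exist_ok=True)
        diff = difflib.unified_diff(old_lines, new_lines, fromfile="old", tofile="new")
        report.write_text("".join(diff), encoding="utf-8")
    return len(changed)
```

Files that exist in only one export (added or removed publications) are skipped here; they could be listed separately if needed.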
PART 1 – Getting All the Content of the WT Library
The WT library encrypts the content of its library files. Some work was done on the previous Comprehensive NWT thread to decrypt the contents of the WT files directly by using the Java code behind the mobile app. However, I chose to go a different route this time: I chose to automate the WT library. By getting the window handles of the top-level WT window, the ListView on the left, and the middle content window, I was able to force the WT library to navigate itself through the entire library tree. Once each article was displayed in the content window, I used the clipboard to get it out. This was accomplished by using standard Windows API calls to send the proper integer messages to the appropriate windows.
You can see the program in action below. It finds the WT ListView window and displays the window handles. It then flashes the WT ListView to confirm to the user that it found the right window. The processing begins and the text is extracted. The full extraction runs overnight.
As it turns out, the WT library never actually releases the memory it consumes when it loads content from an encrypted library file. Why doesn’t this ever become visible to the user? Because the user would have to open hundreds and hundreds of articles before the memory usage becomes burdensome on the system. As soon as the user closes the WT library, all the memory is naturally released, as the OS reclaims everything the process had reserved.
But for an automation program, it presents a problem. I want to dump the entire WT library. As the program runs, the memory usage rises and reaches a critical point. The OS steps in and kills the process. When the WT library is killed, the location in the hierarchy goes away, as well as the window handle. I got around it by keeping track of the location in the WT hierarchy and then re-executing the program when I detected my window handles became invalid.
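The resume-after-crash idea can be sketched roughly as follows, in Python rather than the original C#. This is a simplification under stated assumptions: the position in the hierarchy is modeled as an index into a flat, ordered list of articles, and `TargetDied`, `export_one`, and `restart_target` are hypothetical stand-ins for detecting an invalid window handle, exporting one article, and relaunching the WT library.

```python
import json
from pathlib import Path


class TargetDied(Exception):
    """Hypothetical signal that the automated application went away."""


def load_checkpoint(path):
    """Return how many articles were already exported (0 if no checkpoint)."""
    p = Path(path)
    if not p.exists():
        return 0
    return json.loads(p.read_text(encoding="utf-8"))


def save_checkpoint(path, done):
    """Write the checkpoint to a temp file, then rename it into place."""
    tmp = Path(str(path) + ".tmp")
    tmp.write_text(json.dumps(done), encoding="utf-8")
    tmp.replace(path)


def run_export(articles, checkpoint_file, export_one, restart_target, max_restarts=100):
    """Export every article in order, restarting the target when it dies."""
    done = load_checkpoint(checkpoint_file)
    restarts = 0
    while done < len(articles):
        try:
            export_one(articles[done])
        except TargetDied:
            restarts += 1
            if restarts > max_restarts:
                raise
            restart_target()  # bring the application back
            continue          # retry the same article
        done += 1
        save_checkpoint(checkpoint_file, done)
    return restarts
```

Because the checkpoint lives on disk, the same logic also covers the case where the export program itself is stopped and started again later.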
Preview (you can see the memory increase, and then the WT library is killed. The traversal program brings it back):
PART 2 – Calculating the Differences
Two traversals are needed, one for WT 2011 and one for WT 2012, and both take about 12 hours. But once a version is exported, it doesn’t have to be exported again. There will be some expected differences. For example, between 2011 and 2012, there will be new entries for the daily text. There may be some publications removed, perhaps to save space on the CD. Also, some folders have a date range, like 1984-2011. In the 2012 version that folder will be named differently: 1984-2012. The WT versions also have some insignificant differences. Some Greek letters were changed from version to version: the Greek character looks the same, but the Unicode value used is different, so the Diff program detects a change. To handle cases like this, I included a place to ignore certain changes.
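One way such an ignore step could work is sketched below in Python. The post does not say which code points actually differed, so the examples here are assumptions: Unicode NFC normalization handles characters that are canonically equivalent (for instance, alpha with oxia U+1F71 versus alpha with tonos U+03AC), and a small hand-maintained table, hypothetical here, covers lookalikes that normalization leaves alone.

```python
import unicodedata

# Hypothetical table of substitutions to treat as "no change".
# NFC does not unify these, so they are listed explicitly.
IGNORED_EQUIVALENTS = {
    "\u00b5": "\u03bc",  # MICRO SIGN -> GREEK SMALL LETTER MU
}


def canonical(text):
    """Map text to a canonical form so cosmetic encoding changes vanish."""
    # NFC merges canonically equivalent sequences, e.g. alpha followed by a
    # combining acute accent becomes the single precomposed character.
    text = unicodedata.normalize("NFC", text)
    for variant, preferred in IGNORED_EQUIVALENTS.items():
        text = text.replace(variant, preferred)
    return text


def is_significant_change(old, new):
    """True only if the texts still differ after canonicalization."""
    return canonical(old) != canonical(new)
```

Running both versions through `canonical` before diffing keeps these encoding-only changes out of the results without hiding real edits.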
I decided to do a character-level diff, then take the markup and mash it up, and export only the changes, with some text before and after. This way I could produce a small library of changes I could post here without worrying about copyright issues. These are just small quotes around the differences only.
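A rough sketch of a character-level diff that keeps only the changes plus a little surrounding text, using Python's standard `difflib` rather than the original VB.NET code. The context width and the `[-removed-]{+added+}` text markers are assumptions for illustration; the actual output used red and green coloring.

```python
import difflib


def changed_snippets(old, new, context=20):
    """Return one dict per changed region, with a little text on each side."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    snippets = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged text is not exported
        snippets.append({
            "before": old[max(0, i1 - context):i1],
            "removed": old[i1:i2],   # present only in the old version
            "added": new[j1:j2],     # present only in the new version
            "after": old[i2:i2 + context],
        })
    return snippets


def render(snippet):
    """Format one snippet as plain text with removal and addition markers."""
    return "...{before}[-{removed}-]{{+{added}+}}{after}...".format(**snippet)
```

For example, `render(changed_snippets("the cat sat", "the dog sat")[0])` gives `...the [-cat-]{+dog+} sat...`. Character-level matching can split a rewritten sentence into many tiny regions, which is why such output takes some patience to read.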
An example of the output is below. It comes from “God’s Love”, 2008 pp 144-159:
The red characters are removed from the 2011 version. The green characters are included in the 2012 version. The white characters exist in both. If you take your time, you can make it out. But if you reference the WT library itself, this is the change:
Easter has also been linked to the worship of the Phoenician fertility goddess, Astarte, who had as her symbols the egg and the hare. Statues of Astarte have variously depicted her as having exaggerated sex organs or with a rabbit beside her and an egg in her hand.
Eostre (or Eastre) was also a fertility goddess. According to The Dictionary of Mythology, “she owned a hare in the moon which loved eggs and she was sometimes depicted as having the head of a hare.”
I wonder if they found the 2011 version to be inaccurate in some way.
Below is the link to the results, a ZIP file. It contains the structure of the WT library with all differences logged in individual files. If there is no file or folder, it means there were no changes between the versions for that particular part of the WT library.
Results Click Here