First, some key points for clarity:
- The WikiLeaks insurance file was created before any of the DNC Leak files were released by WikiLeaks or Guccifer 2.0, so it is assumed that the insurance file includes all of the material Guccifer 2.0 said he gave to WikiLeaks.
- All totals are given in bytes. This avoids confusion about whether totals are counted in base-2 (1,024 bytes = 1 KB) or base-10 (1,000 bytes = 1 KB).
- Although precise byte numbers are used when possible, there are still variables in the file sizes that cannot be fully accounted for. These include the possibility of the file size being decreased by compression, increased by encryption, or increased by added deniability blocks, which can obscure the total amount of information being encrypted.
- The working assumption is that changes to the file size resulting from compression and encryption cancel each other out. This rests on Occam’s razor and generalizations about file encryption and compression, so the numbers should be considered back-of-the-envelope calculations.
- File numbers and sizes are calculated by Finder in the latest version of OS X, including folders. All files are uncompressed, unless they were compressed in their original presentation (i.e. HRC.zip).
- Due to the timing, the assumption is that the latest insurance file deals mostly with the DNC leaks. However, there is no reason to believe that WikiLeaks would not have included other information that is pending publication or being reviewed by WikiLeaks. Therefore, the percentages refer to the insurance file and not necessarily DNC/DCCC/Hillary/etc. leak material.
- This also assumes that Guccifer 2.0 is, in fact, the source of the materials given to WikiLeaks. However, WikiLeaks has not confirmed this and, as a matter of policy, does not comment on its sources in any fashion. This may be a fairly safe assumption, but it is an assumption.
Total insurance file size: 94,087,593,304 bytes.
Total released DNC emails file size: 1,738,581,029 bytes.
Total DNC email attachments: 713,283,588 bytes.
Guccifer files released to public: 125,050,273 bytes.
Guccifer files sent to Joe Uchill: 39,129,817 bytes.
Total confirmed files: 1,902,761,119 bytes.
Confirmed percentage of insurance file: 2.022%
Total confirmed files, with separate attachments: 2,616,044,707 bytes.
Confirmed percentage of insurance file, with separate attachments: 2.78%
Thomson Reuters photos: 2,233,365,050 bytes.
Total identified potential files: 4,136,126,169 bytes.
Potential percentage of insurance file: 4.396%
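The totals above are sums of the listed component sizes, each expressed as a share of the insurance file. As a rough check, the Python sketch below recomputes them from those components. The constant and function names are illustrative and not part of the original analysis.

```python
# Sizes in bytes, copied from the figures listed above.
INSURANCE_FILE = 94_087_593_304
DNC_EMAILS = 1_738_581_029
EMAIL_ATTACHMENTS = 713_283_588
GUCCIFER_PUBLIC = 125_050_273
GUCCIFER_UCHILL = 39_129_817
THOMSON_REUTERS = 2_233_365_050


def pct_of_insurance(size_bytes: int) -> float:
    """Return size_bytes as a percentage of the insurance file size."""
    return 100 * size_bytes / INSURANCE_FILE


# Confirmed: released emails plus both sets of Guccifer 2.0 files.
confirmed = DNC_EMAILS + GUCCIFER_PUBLIC + GUCCIFER_UCHILL
# Same total, counting the attachments a second time as separate files.
confirmed_with_attachments = confirmed + EMAIL_ATTACHMENTS
# Confirmed files plus the Thomson Reuters photos.
identified = confirmed + THOMSON_REUTERS

for label, size in [
    ("confirmed", confirmed),
    ("confirmed + separate attachments", confirmed_with_attachments),
    ("confirmed + Thomson Reuters", identified),
]:
    print(f"{label}: {size:,} bytes ({pct_of_insurance(size):.3f}%)")
```

Because everything is kept in whole bytes until the final division, the only rounding happens when the percentage is printed.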
Potentially identified, but unreleased:
DNC/DCCC photos (estimate): 225,000,000 bytes (high end)
Total estimated potential files: 2,002,710,846 bytes.
Estimated percentage of insurance file: 2.129%
Total estimated potential files, with Thomson Reuters: 4,236,075,896 bytes.
Estimated percentage of insurance file, with Thomson Reuters: 4.502%
Total estimated potential files, with Thomson Reuters, DNC/DCCC and separate email attachments: 4,949,359,484 bytes.
Estimated percentage of insurance file, with Thomson Reuters, DNC/DCCC and separate email attachments: 5.26%
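The estimated totals can be reproduced the same way, with one assumption worth flagging. The listed 2,002,710,846-byte figure equals the released emails plus the Uchill files plus the high-end photo estimate. The publicly released Guccifer files are not part of that sum. The sketch below uses that composition, and its names are illustrative.

```python
# Sizes in bytes, copied from the figures listed above.
INSURANCE_FILE = 94_087_593_304
DNC_EMAILS = 1_738_581_029
EMAIL_ATTACHMENTS = 713_283_588
GUCCIFER_UCHILL = 39_129_817
THOMSON_REUTERS = 2_233_365_050
PHOTOS_HIGH_END = 225_000_000  # DNC/DCCC photo estimate, high end


def pct_of_insurance(size_bytes: int) -> float:
    """Return size_bytes as a percentage of the insurance file size."""
    return 100 * size_bytes / INSURANCE_FILE


# Assumed composition: this is the combination of components that
# reproduces the listed 2,002,710,846-byte estimate.
estimated = DNC_EMAILS + GUCCIFER_UCHILL + PHOTOS_HIGH_END
estimated_with_tr = estimated + THOMSON_REUTERS
estimated_with_all = estimated_with_tr + EMAIL_ATTACHMENTS

for label, size in [
    ("estimated", estimated),
    ("estimated + Thomson Reuters", estimated_with_tr),
    ("estimated + Thomson Reuters + attachments", estimated_with_all),
]:
    print(f"{label}: {size:,} bytes ({pct_of_insurance(size):.3f}%)")
```

Even the largest of these totals accounts for only a little over 5% of the insurance file.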
Does the insurance file include everything Guccifer 2.0 sent to WikiLeaks?
Probably. In an email to Gawker, Guccifer said that he had “about 100 Gb of data including financial reports, donors’ lists, election programs, action plans against Republicans, personal mails, etc.” Rounding 94,087,593,304 bytes (about 94 GB in base-10) up to 100 GB is reasonable. The insurance file probably also includes the materials published directly by Guccifer.
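To make the rounding concrete, here is a small sketch that expresses the insurance file size under both unit conventions and measures how far it falls short of a decimal 100 GB. It assumes Guccifer’s “100 Gb” means gigabytes, and the variable names are illustrative.

```python
INSURANCE_FILE = 94_087_593_304  # bytes

# Base-10 convention: 1 GB = 1,000,000,000 bytes.
decimal_gb = INSURANCE_FILE / 10**9
# Base-2 convention: 1 GiB = 1,073,741,824 bytes.
binary_gib = INSURANCE_FILE / 2**30
# Bytes by which the file falls short of a decimal 100 GB.
shortfall = 100 * 10**9 - INSURANCE_FILE

print(f"{decimal_gb:.2f} GB (base-10)")
print(f"{binary_gib:.2f} GiB (base-2)")
print(f"{shortfall:,} bytes short of 100 GB")
```

Under either convention the file is within roughly 6 to 13 percent of 100, which fits a loose “about 100” description.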
What about the size of the voice mails?
They’re included in the emails as attached files.
What if the email attachments were included in the native emails and as separate files?
If the email attachments were sent both as separate files and embedded in the .eml files, the separate copies would add another 713,283,588 bytes.
What about the Thomson Reuters photos?
It’s unknown if the Thomson Reuters photos were included in the files Guccifer 2.0 gave to WikiLeaks, but they were exposed in the DNC hack. The 429 photos come to 2,233,365,050 bytes.
What about the DNC photos?
It’s unknown if the DNC/DCCC photos were included in Guccifer 2.0’s files, or how many pictures there were. In addition to the Thomson Reuters photos, Guccifer 2.0’s hack gave access to a photography service’s photos from at least three DNC/DCCC events. That service took photos for at least twelve private DNC and DCCC events in the time period that the hack covered. The service offered high-resolution pictures, typically 1600 × 1088. Its watermarked sample images from other events, provided as previews before ordering a professional-quality print or download, are around 450,000 bytes each. It’s unknown how many pictures were included in each set, but there are three likely ranges, each calculated at 450,000 bytes per photo.
- Small – 100 photos: 45,000,000 bytes
- Medium – 250 photos: 112,500,000 bytes
- Large – 500 photos: 225,000,000 bytes
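The three ranges above are the per-photo size multiplied by an assumed photo count. A minimal sketch of that estimate follows. The function name and the dictionary of ranges are illustrative, and the 450,000-byte figure is the approximate size of the watermarked samples rather than a measured average.

```python
BYTES_PER_PHOTO = 450_000  # approximate size of one watermarked sample image


def estimate_set_size(photo_count: int, bytes_per_photo: int = BYTES_PER_PHOTO) -> int:
    """Estimate the total size in bytes of a set of photos."""
    return photo_count * bytes_per_photo


# Assumed photo counts for the small, medium and large ranges.
ranges = {"small": 100, "medium": 250, "large": 500}
estimates = {name: estimate_set_size(count) for name, count in ranges.items()}

for name, size in estimates.items():
    print(f"{name}: {ranges[name]} photos, {size:,} bytes")
```

Changing `BYTES_PER_PHOTO` shows how sensitive the estimate is to the assumed per-image size.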
What about the DCCC files?
The DCCC files are a little tricky. It’s easy to assume that the DCCC files would have been included in the original batch of files sent to WikiLeaks, but this isn’t consistent with what Guccifer 2.0 said. On August 12th, after the first batch of DNC emails was published on WikiLeaks and long after the insurance file was published, Guccifer tweeted that he was sending the DCCC files to WikiLeaks. If that date is accurate and not meant to mislead, then the DCCC files are not included in the WikiLeaks insurance file.