Releasing the records
A bit less than a year ago, I embarked on a quest to get a copy of the millions of pages of CIA documents stored on CREST, the CIA Records Search Tool. The CREST database was technically publicly available, in the sense that anyone could theoretically use the four computers located in the back of a library that (for budgetary reasons) lacks a librarian for half of the day. These four computers are currently the only ones that can access the CREST database, and they’re only accessible Monday through Friday from 9 AM to 4:30 PM. In other words, most people who aren’t full-time researchers can’t use the database even if they’re within driving distance. By printing out and scanning the documents at CIA expense, I was able to begin making them freely available to the public and to give the Agency a financial incentive to simply put the database online. I’m pleased to say that these efforts have been a success, and the Agency is putting the database online.
CIA isn’t doing this out of the goodness of their hearts. Several FOIA requests have been filed for the database, including by the National Security Archive and MuckRock. MuckRock actually sued the Agency with the help of Kel McClanahan of National Security Counselors. The Agency said it would take 28 years to process the files. After some more legal pressure from Mr. McClanahan, the Agency reduced their estimate to six years. This was still too long, and so I began my effort. The hope was that financial pressure, negative press, and making the files’ release a practical as well as a legal inevitability would force the Agency to speed up their timetable. Thanks to the combined (but uncoordinated) efforts of myself and MuckRock, these files will soon be available.
However, there are some problems with the Agency’s statement about the matter. According to the Agency, the database will retain all of its features when put online. This is extremely unlikely, since the database is currently accessed through proprietary software known as Laserfiche. Laserfiche allows for many browsing and search parameters and sorting functions that the Agency’s website simply doesn’t offer. In many instances, the Agency’s website is simply broken (for instance, users are unable to even go to the second page of the online listing of CREST documents) and users are unable to properly browse the categories of documents already uploaded. It’s more likely that the Agency spokesman was unaware of this and only meant to refer to the text recognition that has been performed on the files. Otherwise, the Agency will have to design or buy an entirely new interface.
To combat this, I’m preparing to reindex and reupload the files in a proper format. While the CREST system at the National Archives has been out of toner for several weeks (CIA has been extremely and deliberately slow in this regard), more toner is expected to arrive this week. This will allow me to retrieve copies of the indexes with the metadata. This will, in turn, be used to organize the files and upload them in a fully searchable format. If the Agency doesn’t retain all of the search functions of the Laserfiche-powered system and simply imports the files into their already broken interface, a new database will be built from the files. Several options are being considered in this regard and more than one organization has expressed interest in partnering on it.
What’s in it?
So what’s in the database? There are a little over 775,000 files that make up over 13,000,000 pages that have been declassified as part of the 25-year automatic declassification review period. Before the most recent update of files at the beginning of the year, the database was estimated to be about 840 gigabytes. Breaking these files down into categories, we get:
- Secretary of State Henry Kissinger’s papers: 40,000 pages of newly declassified documents. The papers did not originate with CIA, but “contain many CIA equities.”
- Directorate of Science and Technology R&D: 20,000 pages
- Analytic intelligence publication files: Over 100,000 pages.
- News archives: The Agency collected a lot of news stories about themselves and the subjects they were interested in. Their news archive, much of which is included in CREST, contains many
- Office of the DCI Collection (ODCI): 28,550 documents/129,000 pages from the records of the first five Directors of Central Intelligence: Admiral Roscoe Hillenkoetter, General Walter “Bedell” Smith, Allen Dulles, John McCone, and Richard Helms. These records run from the beginning of CIA in 1947 through the late 1960s and include a wide variety of memos, letters, minutes of meetings, chronologies and related files from the Office of the DCI (ODCI) that document the high-level workings of the CIA during its early years.
- Directorate of Intelligence (DI) Central Intelligence Bulletins: 8,800 documents/123,000 pages from a collection of daily Central Intelligence Bulletins (CIB), National Intelligence Bulletins (NIB) and National Intelligence Dailies (NID) running from 1951 through 1979. The CIBs/NIBs were published six days a week (Monday through Saturday) and were all-source compilations of articles, consisting initially of short Daily Briefs and longer Significant Intelligence Reports and Estimates on key events and topics of the day. The CIBs/NIBs were circulated to high-level policy-makers in the US Government.
- General CIA Records: Records from the CIA’s archives that are 25 years old or older, including a wide variety of finished intelligence reports, field information reports, high-level Agency policy papers and memoranda, and other documents produced by the CIA.
- STAR GATE: A 25-year Intelligence Community effort that used remote viewers who claimed to use clairvoyance, precognition, or telepathy to acquire and describe information about targets that were blocked from ordinary perception. The records include documentation of remote viewing sessions, training, internal memoranda, foreign assessments, and program reviews.
- Consolidated Translations: Translated reports of foreign-language technical articles of intelligence interest, organized by author, with each document covering a single subject.
- Scientific Abstracts: Abstracts of foreign scientific and technical journal articles from around the world.
- Ground Photo Caption Cards: Used to identify photographs in the NIMA ground photograph collection. Each caption card contains a serial number that corresponds to the identical serial number on a ground photograph. The master negatives of the ground photography collection have been accessioned separately to NARA. The caption cards provide descriptive information to help identify which master negatives researchers may wish to request.
- National Intelligence Survey: National Intelligence Survey gazetteers.
- NGA: Records from the National Geospatial-Intelligence Agency, primarily photographic intelligence reports.
- Joint Publications Research Service: Provided translations of regional and topical issues in the late 1970s and early 1980s.
- Office of Strategic Services files: Documents from the OSS, CIA’s World War II predecessor.
While these documents are older, they aren’t irrelevant. One of the CREST documents provided the smoking gun for my exposé on an NSA Director sabotaging the NSA.
When will they come out?
The timeline on these files is a little bit sketchy at this point. It’s unlikely that the files will be online before the election. If the Agency is going to keep their word and put the database online, it’s likely that it’ll happen in the next few months. New files are usually added to CREST between January and March, depending on the speed of bureaucracy. It’s unlikely that they’ll add these files to the offline database just before migrating them and hundreds of thousands of other files to an online database.
Work on reconstructing the database with all of the metadata can begin as early as next week. Once the metadata has been retrieved and organized, work on the database itself can begin. 23,500 CIA documents (340,000 pages) have already been obtained and can be added to the database ASAP. The details of the timeline, however, depend on both when the Agency puts the files online and when potential partners are ready to move forward.