Semantic tool to access WikiLeaks' Spy Files

03 February, 2014

Note: Silk was discontinued in August 2016, and this project is no longer available.

The project was covered by several news outlets, including The International Business Times, Schneier on Security, Netzpolitik, and Disinformation.

On the 15th of September 2014 WikiLeaks released its last batch of leaked Spy Files, a series it had started releasing in 2011. I cleaned and structured the data to build an interactive database combining the four Spy Files releases. The database currently holds 559 leaked company documents and 15 location-tracking reports from the WikiLeaks Counter Intelligence Unit (WLCIU). The 559 files make public internal documents from more than 100 companies specializing in intelligence and (mass) surveillance technologies. These technologies are sold both to Western governments and to dictators, and have been used by the Syrian government. The 15 WLCIU documents reveal the timestamps and locations of 20 members of these companies, whose whereabouts WikiLeaks decided to track in order to show where the main surveillance contractors are sending their people.

But what does the Spy Files database actually contain? Which intelligence companies appear most often, and what systems do they target? How can you download exactly the leaked document your research calls for? To answer these questions, I’ve imported the WikiLeaks database into Silk, combining it with semantic technologies, a powerful query engine, and a user-friendly interactive visualization interface.