The ICIJ’s information totaled more than 260 gigabytes of useful data: one of the biggest collections of leaked data ever gathered and analyzed by a team of investigative journalists.
Users will be able to search for documents by date, topic, person, location, etc. and will be able to do "document dives," collaboratively examining large sets of documents. Think of it as a card catalog for primary source documents. DocumentCloud is not meant to be a general document hosting service, like Scribd, Docstoc or Google Docs. Our goal is to build a service that makes source documents easier to find and share regardless of where they are hosted. It is a complement to these services, not a competitor. The goal is to make documents even easier to find on search engines. DocumentCloud will have information about documents and the relations between them, for example which locations, people, or organizations a group of documents has in common. Conceived by journalists working at ProPublica and The New York Times, DocumentCloud will be managed as an independent nonprofit.
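The idea of finding what a group of documents has in common can be sketched as a set intersection over entities extracted from each document. This is only an illustration, not DocumentCloud's implementation: the document names, the entity lists, and the `shared_entities` helper are all hypothetical, and the entities are assumed to have been extracted already.

```python
# Hypothetical sample data: entities assumed to be already extracted
# from each document. None of this comes from DocumentCloud itself.
documents = {
    "memo_a": {
        "people": {"Jane Doe", "John Roe"},
        "locations": {"Chicago", "Albany"},
    },
    "memo_b": {
        "people": {"Jane Doe"},
        "locations": {"Albany", "Denver"},
    },
    "memo_c": {
        "people": {"Jane Doe", "Ann Poe"},
        "locations": {"Albany"},
    },
}


def shared_entities(docs, kind):
    """Return the entities of one kind that appear in every document."""
    entity_sets = [doc.get(kind, set()) for doc in docs.values()]
    if not entity_sets:
        return set()
    return set.intersection(*entity_sets)


common_people = shared_entities(documents, "people")
common_locations = shared_entities(documents, "locations")
```

Here `common_people` holds the names found in all three memos, which is the kind of relation a reporter might use as a starting point for a "document dive."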
National Institute for Computer Assisted Reporting (Investigative Reporters and Editors Inc.). A source of data for journalists. Data are available only to members of IRE. IRE handles the Freedom of Information hassles in requesting and obtaining the data, transfers the data from various media and formats, puts it in a database format that is ready to use, converts date and numeric fields out of their text versions so they are ready for analysis, and makes sure you have complete, detailed record layouts, codesheets and other documentation. IRE also documents glitches or flaws in the data and offers suggestions for story ideas and places to find stories that have already been done using this data.
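As a rough illustration of what converting date and numeric fields out of their text versions involves, here is a minimal sketch using only Python's standard library. The column names, the sample rows, and the `MM/DD/YYYY` date format are assumptions made for the example; they do not describe any actual IRE dataset or IRE's own process.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Hypothetical raw export: every field arrives as text.
RAW = """filing_date,amount
03/15/2009,"1,250.00"
11/02/2010,"980.50"
"""


def clean_row(row):
    """Convert the text fields of one record into typed values."""
    return {
        # Assumes dates are written as MM/DD/YYYY.
        "filing_date": datetime.strptime(row["filing_date"], "%m/%d/%Y").date(),
        # Strip thousands separators before parsing the number.
        "amount": Decimal(row["amount"].replace(",", "")),
    }


rows = [clean_row(r) for r in csv.DictReader(io.StringIO(RAW))]

# Once typed, the fields can be sorted and summed directly.
earliest = min(r["filing_date"] for r in rows)
total = sum(r["amount"] for r in rows)
```

With the fields typed, date comparisons and arithmetic work as expected, which is what makes the data "ready for analysis" rather than a column of strings.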
This post is part of our ReadWriteCloud channel, which is dedicated to covering virtualization and cloud computing. The channel is sponsored by Intel and VMware. In our last post on data journalism, we ran across a number of tools that would be helpful for anyone who is interested in how to make sense of data. The tools represent a renaissance in how we make sense of our information culture. They provide context and meaning to the often baffling world of big data. This is a snapshot of what is available. We are relying on the work done by Paul Bradshaw, whose blog is an excellent source about the new world of data journalism.