The Data-gov Wiki is a project being pursued in the Tetherless World Constellation at Rensselaer Polytechnic Institute. We are investigating open government datasets using semantic web technologies. Currently, we are translating such datasets into RDF, getting them linked to the linked data cloud, and developing interesting applications and demos on linked government data. Most of the datasets shown on this page come from the US government's data.gov Web site, although some are from other countries or non-government sources.
DigitalCorpora.org is a website of digital corpora for use in computer forensics education research. All of the disk images, memory dumps, and network packet captures available on this website are freely available and may be used without prior authorization or IRB approval. We also have available a research corpus of real data acquired from around the world.
A corpus of 1 million documents that are freely available for research and may be (to the best of our knowledge) freely redistributed. These documents were obtained by performing searches for words randomly chosen from the Unix dictionary, numbers randomly chosen between 1 and 1 million, and randomized combinations of the two, for documents of specified file types that resided on web servers in the .gov domain using the Yahoo and Google search engines.
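The query-generation step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the corpus authors' actual tooling: the function name `make_query`, the tiny in-memory word list (standing in for the Unix dictionary), and the use of `filetype:` and `site:` search operators are all assumptions introduced here.

```python
import random


def make_query(words, rng, filetype, domain="gov"):
    """Build one randomized search query (hypothetical helper).

    The search term is a random dictionary word, a random number
    between 1 and 1 million, or a combination of the two.
    """
    kind = rng.choice(("word", "number", "both"))
    word = rng.choice(words)
    number = rng.randint(1, 1_000_000)  # inclusive on both ends

    if kind == "word":
        term = word
    elif kind == "number":
        term = str(number)
    else:
        term = f"{word} {number}"

    # Assumption: restrict results by file type and domain using
    # search-engine operators; the source does not say how this was done.
    return f"{term} filetype:{filetype} site:.{domain}"


if __name__ == "__main__":
    # Stand-in for the Unix dictionary (assumed; a real run might read
    # a word list file instead).
    sample_words = ["anchor", "basalt", "cipher", "delta"]
    rng = random.Random()
    for file_type in ("pdf", "doc", "xls"):
        print(make_query(sample_words, rng, file_type))
```

Each generated string would then be submitted to a search engine and the resulting documents downloaded; that part is omitted here because the source gives no details about it.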
Robert Steele, the longtime proponent of a robust open source intelligence program, has a web site which notably includes this archive of intelligence-policy related documents.
Odd, obscured links to online books, including Gutenberg and government books, but also, perhaps, pirated copies of copyrighted books? Over 100,000 free online books and ebooks. All books are sorted and categorized automatically by a computer program. Some of them might have been placed in the wrong category.
Discover the data behind the Department of Energy's scientific publications. Use the DOE Data Explorer (DDE) to find scientific research data - such as computer simulations, numeric data files, figures and plots, interactive maps, multimedia, and scientific images - generated in the course of DOE-sponsored research in various science disciplines.