a corpus of 1 million documents that are freely available for research and may be (to the best of our knowledge) freely redistributed. These documents were obtained by using the Yahoo and Google search engines to search for documents of specified file types that resided on web servers in the .gov domain. The search terms were words randomly chosen from the Unix dictionary, numbers randomly chosen between 1 and 1 million, and randomized combinations of the two.
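The sampling idea described above can be illustrated with a minimal sketch. This is not the script used to build the corpus: the function name `build_queries`, its parameters, and the use of the `site:` and `filetype:` search operators are assumptions made for illustration only, and the word list is passed in rather than read from a particular dictionary file.

```python
import random


def build_queries(words, file_types, count, seed=None, max_number=1_000_000):
    """Return `count` randomized search queries (hypothetical helper).

    Each query uses a random word, a random number between 1 and
    `max_number`, or a combination of the two, restricted to one file
    type and to the .gov domain.
    """
    rng = random.Random(seed)  # seeded generator so runs can be repeated
    queries = []
    for _ in range(count):
        word = rng.choice(words)
        number = rng.randint(1, max_number)  # both endpoints are included
        kind = rng.choice(("word", "number", "both"))
        if kind == "word":
            term = word
        elif kind == "number":
            term = str(number)
        else:
            term = f"{word} {number}"
        file_type = rng.choice(file_types)
        # Assumed query syntax; the original searches may have expressed
        # the domain and file-type restrictions differently.
        queries.append(f"{term} site:.gov filetype:{file_type}")
    return queries


if __name__ == "__main__":
    sample_words = ["archive", "ledger", "river"]  # stand-in for a dictionary
    for query in build_queries(sample_words, ["pdf", "doc"], 3, seed=42):
        print(query)
```

Passing a fixed `seed` makes the list of queries reproducible, which is useful if the same sample needs to be regenerated later.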
The aim of the project is to address the challenge of implementing good-quality standardised file formats for preserving data content in the long term. The main objective is to give memory institutions full control over the conformity testing of files to be ingested into archives.
The Library of Congress and its digital preservation partners from the federal, library, creative, publishing, technology, and copyright communities are working to develop a national strategy to collect, archive, and preserve digital content.
ACE (Auditing Control Environment) is a system that incorporates a new methodology to address the integrity of long-term archives using rigorous cryptographic techniques. ACE continuously audits the contents of the various objects according to the policy set by the archive, and provides mechanisms for an independent third-party auditor to certify the integrity of any object.
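To give a feel for what auditing the integrity of archived objects involves, here is a minimal fixity-check sketch. It is not ACE's implementation or API, and it leaves out the third-party certification that ACE provides: it only compares each file's current SHA-256 digest with a previously recorded one. The names `sha256_of` and `audit`, and the manifest layout (a dictionary mapping file paths to expected digests), are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest):
    """Check each file against its recorded digest (hypothetical helper).

    `manifest` maps a file path to the hex digest recorded at ingest.
    Returns a dict mapping each path to "ok", "corrupt", or "missing".
    """
    report = {}
    for path, expected in manifest.items():
        if not Path(path).is_file():
            report[path] = "missing"
        elif sha256_of(path) == expected:
            report[path] = "ok"
        else:
            report[path] = "corrupt"
    return report
```

Running `audit` on a schedule and recording each report would approximate continuous auditing. Letting an outside party verify the results would need more than this sketch offers, since whoever can alter the files could also alter the manifest.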
This is a grid, created by POWRR, that compares about 50 digital preservation tools across 24 different features, such as ingest, processing, access, storage, maintenance, and cost. The tools range from simple utilities to full digital preservation systems, from ACE to Xena. The grid is a very informative overview.
APARSEN. This is not just another such list. It is an attempt to build an evidence base for preservation tools, and in particular to identify which tools are appropriate for which type of data.
4C will help organisations across Europe to invest more effectively in digital curation and preservation. Research in digital preservation and curation has tended to emphasize the cost and complexity of the task in hand. 4C reminds us that the point of this investment is to realise a benefit, so our research must encompass related concepts such as risk, value, quality and sustainability. Organisations that understand this will be better able to control and manage their digital assets over time, and they may also be able to create new cost-effective solutions and services for others.
Born Digital: Guidance for Donors, Dealers, and Archival Repositories offers recommendations to help ensure the physical and intellectual well-being of digital media and files during different stages of the acquisition process. Co-authored by a team of ten archivists and curators from the Beinecke, the Bodleian, the British Library, the Harry Ransom Center, Emory's Manuscripts, Archives, and Rare Books Library, and the Rubenstein Library at Duke, the report is the outcome of a series of conversations about how born-digital materials are acquired and transferred to archival repositories.
SCAPE is devoted to enhancing the state of the art of long-term digital preservation by developing an infrastructure and tools for scalable preservation actions (the SCAPE Platform and Components), and a framework for automated, quality-assured preservation workflows. Additionally, these components will be integrated with a policy-based Pr
The Crisis, Tragedy and Recovery network (CTRnet) is a digital library network that provides a range of services relating to different kinds of tragic events. Through this digital library, we collect and archive different types of CTR-related information, such as Web sites, videos, blogs and tweets. Various collections about school shootings and natural disasters have been developed in collaboration with the Internet Archive.