Zanran helps you find ‘semi-structured’ data on the web. This is the numerical data that people have presented as graphs, tables and charts. For example, the data could be a graph in a PDF report, a table in an Excel spreadsheet, or a bar chart shown as an image in an HTML page. Put more simply: Zanran is Google for data. At present, we extract tables and images from HTML, PDF and Excel files, and we will be processing PowerPoint and Word documents in the near future.
The objectives of this initiative are to establish easier access to scientific research data on the Internet, to increase acceptance of research data as legitimate, citable contributions to the scientific record, and to support data archiving that will permit results to be verified and re-purposed for future study. DataCite will promote data sharing, increased access, and better protection of research investment.
Governments around the globe are opening up their data vaults – allowing you to check out the numbers for yourself. This is the Guardian’s gateway to that information. Search for government data here from the UK (including London), USA, Australia and New Zealand – and look out for new countries and places as we add them.
The Nucleic Acids Research online Molecular Biology Database Collection is a public repository that lists more than 1000 databases described in this and previous Nucleic Acids Research annual database issues, as well as a selection of molecular biology databases described in other journals. All databases included in this Collection are freely available to the public. The 2008 update includes 1078 databases, 110 more than the previous one. The links to more than 80 databases have been updated and 25 obsolete databases have been removed from the list. The complete database list and summaries are available online at the Nucleic Acids Research web site, http://nar.oxfordjournals.org/.
The Data Hub is a community-run catalogue of useful sets of data on the Internet. You can collect links here to data from around the web for yourself and others to use, or search for data that others have collected. Depending on the type of data (and its conditions of use), the Data Hub may also be able to store a copy of the data or host it in a database, and provide some basic visualisation tools. This site runs on the open-source data cataloguing software called CKAN, written and maintained by the Open Knowledge Foundation. Each 'dataset' record on CKAN contains a description of the data and other useful information, such as which formats it is available in, who owns it, whether it is freely available, and which subject areas it covers. Other users can improve or add to this information (CKAN keeps a fully versioned history). CKAN powers a number of data catalogues on the Internet. The Data Hub is an openly editable open data catalogue in the style of Wikipedia.
REACTOME is a free, online, open-source, curated pathway database encompassing many areas of human biology. Information is authored by expert biological researchers, maintained by the Reactome editorial staff and cross-referenced to a wide range of standard biological databases.
Created and maintained by Paul Hensel of the Department of Political Science at Florida State University. This site links to the most useful on-line data sources on international conflict and cooperation; on international economic, environmental, political, and social data; and on similar topics for the United States.