Our mission is to make data about San Diego freely available for everyone to use. Data about San Diego are any data that describe San Diego in any way. We don't care where the data come from, whether it's the city government, federal sources, or any other organization that gathers data. We also want to highlight the importance of knowing how to use data. The data science industry is growing rapidly, and San Diego's economy is well-positioned to benefit from this growth. One thing we want to do at Open San Diego is shine a spotlight on data scientists and show how important and exciting their work can be.
National Archives and Records Administration. The U.S. National Archives and Records Administration (NARA) has 149 files of the Censuses of Manufactures, 1972, 1977, 1982, 1987, and 1992. You can find the series description, as well as the file descriptions, in NARA's Archival Research Catalog (ARC), www.archives.gov/research/arc. The ARC ID for the series description is 574852, and it can be used as the keyword to retrieve the description.
Zanran helps you to find ‘semi-structured’ data on the web. This is the numerical data that people have presented as graphs, tables and charts. For example, the data could be a graph in a PDF report, a table in an Excel spreadsheet, or a bar chart shown as an image in an HTML page. Put more simply: Zanran is Google for data. At present, we extract tables and images from HTML, PDF and Excel files and will be processing PowerPoint and Word documents in the near future.
DataCatalogs.org aims to be the most comprehensive list of open data catalogs in the world. It is curated by a group of leading open data experts from around the world, including representatives from local, regional and national governments, international organisations such as the World Bank, and numerous NGOs. See also The Data Hub, CKAN, and the Open Knowledge Foundation.
The Data Hub is a community-run catalogue of useful sets of data on the Internet. You can collect links here to data from around the web for yourself and others to use, or search for data that others have collected. Depending on the type of data (and its conditions of use), the Data Hub may also be able to store a copy of the data or host it in a database, and provide some basic visualisation tools. This site runs on the open-source data cataloguing software called CKAN, written and maintained by the Open Knowledge Foundation. Each 'dataset' record on CKAN contains a description of the data and other useful information, such as what formats it is available in, who owns it and whether it is freely available, and what subject areas the data covers. Other users can improve or add to this information (CKAN keeps a fully versioned history). CKAN powers a number of data catalogues on the Internet. The Data Hub is an openly editable open data catalogue in the style of Wikipedia.
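As a rough illustration of the dataset record described above, the sketch below pulls the description, licence, formats and subject tags out of a CKAN-style record. The field names (`title`, `notes`, `license_id`, `resources`, `tags`) follow the dataset dictionaries that CKAN's Action API returns; the record itself and the `summarize_dataset` helper are hypothetical, invented only for this example.

```python
from typing import Any, Dict


def summarize_dataset(record: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the fields the text mentions from a CKAN-style dataset record."""
    resources = record.get("resources", [])
    return {
        "title": record.get("title") or record.get("name", ""),
        "description": record.get("notes", ""),
        "license": record.get("license_id", "unknown"),
        # Formats are upper-cased and de-duplicated across all resources.
        "formats": sorted({r["format"].upper() for r in resources if r.get("format")}),
        "tags": [t["name"] for t in record.get("tags", [])],
    }


# Hypothetical record, made up for illustration; not real Data Hub content.
example_record = {
    "name": "example-dataset",
    "title": "Example Dataset",
    "notes": "A placeholder description of the data.",
    "license_id": "cc-by",
    "resources": [
        {"url": "https://example.org/data.csv", "format": "csv"},
        {"url": "https://example.org/data.json", "format": "JSON"},
        {"url": "https://example.org/data-2.csv", "format": "CSV"},
    ],
    "tags": [{"name": "transport"}, {"name": "geography"}],
}

summary = summarize_dataset(example_record)
print(summary)
```

A live catalogue would return a record like this wrapped in a response envelope, so a real client would first unwrap the `result` field before calling the helper.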
Roughly, the specification consists of two parts: (1) a schema (in essence DCAT) specifying a serialization of Dataset information, and (2) a protocol/API for getting this information from a compliant data catalogue site.
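To make the schema half of this concrete, here is a minimal sketch of serializing one dataset with terms from the W3C DCAT vocabulary (`dcat:Dataset`, `dcat:distribution`, `dcat:accessURL`, `dcat:keyword`) and Dublin Core (`dct:title`, `dct:description`, `dct:format`). The text does not say which concrete serialization the specification uses, so the JSON-LD-like layout, the `to_dcat` helper and the example values are all assumptions.

```python
import json
from typing import Any, Dict, List


def to_dcat(title: str, description: str, keywords: List[str],
            files: List[Dict[str, str]]) -> Dict[str, Any]:
    """Build a DCAT-flavoured description of one dataset (hypothetical layout)."""
    return {
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:keyword": list(keywords),
        # Each downloadable file becomes one dcat:Distribution entry.
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:accessURL": f["url"],
                "dct:format": f["format"],
            }
            for f in files
        ],
    }


# Example values are placeholders, not taken from any real catalogue.
doc = to_dcat(
    title="Example Dataset",
    description="A placeholder description.",
    keywords=["transport"],
    files=[{"url": "https://example.org/data.csv", "format": "CSV"}],
)
serialized = json.dumps(doc, indent=2, sort_keys=True)
print(serialized)
```

The protocol half would then define how a client requests documents like this from a catalogue site; that part is not sketched here because the text gives no endpoint details.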
Databib is a tool for helping people identify and locate online repositories of research data. Over 200 data repositories have been cataloged in Databib, with more being added every week. Users and bibliographers create and curate records that describe data repositories, which users can then browse and search.