3lib.org, pronounced “freelib”, is a project by the Open Library Society. We improve access to, and use of, freely available scholarly metadata. Most of the records discussed here are already used in the Society’s AuthorClaim service. We are making the records available here for others to use.
Large quantities of historical newspapers are being digitized and OCR'd. We describe a framework for processing the OCR'd text to identify articles and extract metadata for them. We describe the article schema and provide examples of features that facilitate their automatic indexing. For this processing, we employ lexical semantics, structural models, and community content. Furthermore, we describe visualization and summarization techniques that can be used to present the extracted events.
Jon Phipps - NSDL Metadata Registry, Cornell University Libraries
An introduction to the Metadata Registry, an open source vocabulary, metadata schema, and DC application profile manager and registry. The Registry provides a bridge between the XML and RDF worlds: it delivers its output in XML Schema and SKOS/OWL, and it offers managed namespace services, URI design, permanent URLs with content negotiation, support for multi-user ontology design, change history, and version management tools.
This dissertation designs a metadata-driven infrastructure for panel data that aims to increase both the quality and the usability of the resulting research data. Data quality determines whether the data appropriately represent a particular aspect of our reality. Usability stems chiefly from comprehensible documentation, accessibility of the data, and interoperability with tools and other data sources. In a metadata-driven infrastructure, metadata are prepared before the digital objects and process steps that they describe. This enables data providers to utilize metadata for many purposes, including process control and data validation. Furthermore, a metadata-driven design reduces the overall costs of data production and facilitates the reuse of both data and metadata. The main use case is the German Socio-Economic Panel (SOEP), but the results are intended to be reusable for other panel studies. The introduction of the Generic Longitudinal Business Process Model (GLBPM) and a general discussion of digital objects managed by panel studies provide a generic framework for the development of a metadata-driven infrastructure for panel studies. A first theoretical application presents two designs for variable linkage to support record linkage and statistical matching with structured metadata: concepts for omnidirectional relations and process models for unidirectional relations. Furthermore, a reference architecture for a metadata-driven infrastructure is designed and implemented. This provides a proof of concept for the previous discussion and an environment for the development of DDI on Rails. DDI on Rails is a data portal optimized for the documentation and dissemination of panel data. The design considers the process model of the GLBPM, the generic discussion of digital objects, the design of a metadata-driven infrastructure, and the proposed solutions for variable linkage.
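The idea that metadata are prepared first and then drive data validation can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `VariableMeta` class, the `CODEBOOK` entries, and the `validate` function are hypothetical and are not taken from the dissertation, the SOEP, or DDI on Rails.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class VariableMeta:
    """Hypothetical metadata record describing one survey variable."""
    name: str
    dtype: type
    valid_values: Optional[FrozenSet[Any]] = None  # None means any value of dtype


# The metadata are written first, before any data records exist.
# Variable names and allowed values are invented for this example.
CODEBOOK = [
    VariableMeta("person_id", int),
    VariableMeta("wave", int, frozenset({2012, 2013, 2014})),
    VariableMeta("employed", int, frozenset({0, 1})),
]


def validate(record: Dict[str, Any], codebook: List[VariableMeta]) -> List[str]:
    """Check one record against the codebook; an empty list means it passes."""
    problems = []
    for meta in codebook:
        if meta.name not in record:
            problems.append(f"{meta.name}: missing")
            continue
        value = record[meta.name]
        if not isinstance(value, meta.dtype):
            problems.append(f"{meta.name}: expected {meta.dtype.__name__}")
        elif meta.valid_values is not None and value not in meta.valid_values:
            problems.append(f"{meta.name}: value {value!r} not allowed")
    return problems
```

Calling `validate(record, CODEBOOK)` on each incoming record returns the list of rule violations for that record. Because the rules live in the metadata rather than in the checking code, the same codebook could in principle also feed documentation or process control.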
R. Amorim, J. Castro, J. da Silva, and C. Ribeiro. New Contributions in Information Systems and Technologies, pp. 101–111. Springer International Publishing, (2015)