3lib.org, pronounced “freelib”, is a project by the Open Library Society. We improve access to, and use of, freely available scholarly metadata. Most of the records discussed here are already used in the Society’s AuthorClaim service. We are making the records available here for others to use.
Large quantities of historical newspapers are being digitized and OCR'd. We describe a framework for processing the OCR'd text to identify articles and extract metadata for them. We describe the article schema and provide examples of features that facilitate automatic indexing of the articles. For this processing, we employ lexical semantics, structural models, and community content. Furthermore, we describe visualization and summarization techniques that can be used to present the extracted events.
Jon Phipps - NSDL Metadata Registry, Cornell University Libraries
An introduction to the Metadata Registry, an open source vocabulary, metadata schema, and DC application profile manager and registry. The Registry provides a bridge between the XML and RDF worlds: it delivers its output in XML Schema and SKOS/OWL, and it offers managed namespace services, URI design, permanent URLs with content negotiation, support for multi-user ontology design, change history, and version management tools.
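As an illustration of what content negotiation on a permanent URL involves, the following is a minimal sketch of choosing a representation from an HTTP Accept header. It is not the Registry's implementation: the `negotiate` function, the `FORMATS` mapping, and the labels `skos-rdf` and `xml-schema` are hypothetical names chosen for the example, and the header parsing is deliberately simplified (no wildcard subtypes such as `application/*`).

```python
from typing import Dict, Optional

# Hypothetical mapping from media type to an output format label.
# The Registry's real media types and formats are not specified in the source.
FORMATS: Dict[str, str] = {
    "application/rdf+xml": "skos-rdf",
    "application/xml": "xml-schema",
}


def negotiate(accept_header: str, available: Dict[str, str]) -> Optional[str]:
    """Return the best available representation for an Accept header, or None."""
    prefs = []
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((q, media_type))

    # Highest quality first; sort is stable, so header order breaks ties.
    prefs.sort(key=lambda p: p[0], reverse=True)
    for q, media_type in prefs:
        if q <= 0:
            continue
        if media_type in available:
            return available[media_type]
        if media_type == "*/*" and available:
            return next(iter(available.values()))
    return None
```

A client that prefers RDF would send something like `Accept: application/rdf+xml, application/xml;q=0.5` and, under these assumptions, receive the SKOS/RDF representation from the same URL that serves XML Schema to other clients.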
This dissertation designs a metadata-driven infrastructure for panel data that aims to increase both the quality and the usability of the resulting research data. Data quality determines whether the data appropriately represent a particular aspect of our reality. Usability originates notably from a conceivable documentation, accessibility of the data, and interoperability with tools and other data sources. In a metadata-driven infrastructure, metadata are prepared before the digital objects and process steps that they describe. This enables data providers to utilize metadata for many purposes, including process control and data validation. Furthermore, a metadata-driven design reduces the overall costs of data production and facilitates the reuse of both data and metadata. The main use case is the German Socio-Economic Panel (SOEP), but the results claim to be re-usable for other panel studies. The introduction of the Generic Longitudinal Business Process Model (GLBPM) and a general discussion of digital objects managed by panel studies provide a generic framework for the development of a metadata-driven infrastructure for panel studies. A first theoretical application presents two designs for variable linkage to support record linkage and statistical matching with structured metadata: concepts for omnidirectional relations and process models for unidirectional relations. Furthermore, a reference architecture for a metadata-driven infrastructure is designed and implemented. This provides a proof of concept for the previous discussion and an environment for the development of DDI on Rails. DDI on Rails is a data portal, optimized for the documentation and dissemination of panel data. The design considers the process model of the GLBPM, the generic discussion of digital objects, the design of a metadata-driven infrastructure, and the proposed solutions for variable linkage.
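The idea that metadata are prepared before the data and then reused for data validation can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's design: `VariableSpec`, `SPECS`, `validate`, and the variable names `pid` and `sex` are hypothetical and do not come from the SOEP or from DDI on Rails.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class VariableSpec:
    """Metadata for one variable, written before any data are collected."""
    name: str
    dtype: type
    allowed: Optional[FrozenSet[Any]] = None  # permitted codes, if restricted


# Hypothetical variable metadata (illustrative names and codes only).
SPECS = [
    VariableSpec("pid", int),
    VariableSpec("sex", int, frozenset({1, 2})),
]


def validate(record: Dict[str, Any], specs: List[VariableSpec]) -> List[str]:
    """Check one data record against the metadata and return error messages."""
    errors = []
    for spec in specs:
        if spec.name not in record:
            errors.append(f"{spec.name}: missing")
            continue
        value = record[spec.name]
        if not isinstance(value, spec.dtype):
            errors.append(f"{spec.name}: expected {spec.dtype.__name__}")
        elif spec.allowed is not None and value not in spec.allowed:
            errors.append(f"{spec.name}: value {value!r} not allowed")
    return errors
```

Because the checks are derived from the metadata rather than written by hand, the same specifications could in principle also drive documentation or process control, which is the kind of reuse the abstract describes.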
As is often the case for social software services, online reference managers are becoming powerful, free solutions for collecting large sets of metadata, in this case collaborative metadata on scientific literature.
Online reference managers are extraordinary productivity tools, but it would be a mistake to take this as their primary interest for the academic community. As is often the case for social software services, online reference managers are becoming power
ALTO (Analyzed Layout and Text Object) is an XML Schema that details technical metadata for describing the layout and content of physical text resources, such as pages of a book or a newspaper. It most commonly serves as an extension schema used within the administrative metadata section of the Metadata Encoding and Transmission Standard (METS). However, an ALTO instance can also exist as a standalone document used independently of METS.
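To show how layout and content sit together in a standalone ALTO instance, here is a minimal sketch that reads the text lines from a small hand-written ALTO fragment using Python's standard library. The sample document, its IDs and coordinates, and the `text_lines` helper are invented for illustration, and the namespace assumes ALTO version 3; other versions use a different namespace URI.

```python
import xml.etree.ElementTree as ET

# Namespace for ALTO v3 (assumption: adjust for other schema versions).
ALTO_NS = {"alto": "http://www.loc.gov/standards/alto/ns-v3#"}

# Hand-written sample; IDs and coordinates are made up for this example.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout>
    <Page ID="P1" WIDTH="2000" HEIGHT="3000">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine ID="TL1">
            <String CONTENT="Historical" HPOS="100" VPOS="200" WIDTH="300" HEIGHT="40"/>
            <SP/>
            <String CONTENT="newspapers" HPOS="420" VPOS="200" WIDTH="320" HEIGHT="40"/>
          </TextLine>
          <TextLine ID="TL2">
            <String CONTENT="digitized" HPOS="100" VPOS="260" WIDTH="280" HEIGHT="40"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def text_lines(alto_xml: str) -> list:
    """Return the text of each TextLine, joining its String elements with spaces."""
    root = ET.fromstring(alto_xml)
    lines = []
    for line in root.iterfind(".//alto:TextLine", ALTO_NS):
        words = [s.get("CONTENT", "") for s in line.findall("alto:String", ALTO_NS)]
        lines.append(" ".join(words))
    return lines
```

Each `String` element carries both the recognized word (`CONTENT`) and its position on the page (`HPOS`, `VPOS`, `WIDTH`, `HEIGHT`), which is what lets the same file support full-text search and highlighting on the page image.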
Annotea is a W3C Semantic Web Advanced Development project that provides a framework for rich communication about Web pages through shared RDF metadata. An RDF model of bookmark classification permits multiple classification systems to be related to each
R. Amorim, J. Castro, J. da Silva, and C. Ribeiro. New Contributions in Information Systems and Technologies, pages 101-111. Springer International Publishing, 2015.