- Ex Libris - DigiTool multi-page entity
- Index to registered METS Profiles
- This is the Platform API documentation for the XSLT transformation service. This service supports the application of XSLT 1.0 stylesheets to XML documents.
- Semantic Web technologies for digital preservation: the SPAR project
- Schema for representing OCR results exported from FineReader 8.0 SDK. Copyright 2001-2006 ABBYY, Inc.
- Schema for representing OCR results exported from FineReader 6.0. Copyright 2001-2002 ABBYY, Inc.
- OCRopus(tm) is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-lingual capabilities.
- The DjVuLibre XML Tools provide for editing the metadata, hyperlinks and hidden text associated with DjVu files. Unlike djvused(1) the DjVuLibre XML Tools rely on the XML technology and can take advantage of XML editors and verifiers.
- 7train is an XSLT 2.0-based tool for generating Metadata and Encoding Transmission Standard (METS) files from standardized XML inputs.
- METS Tool
- Console
- Use this tool to test a given XPath expression against an XML document. The tool can also be called from the query line, with the parameters: xpath, xmlurl and xml.
- Large quantities of historical newspapers are being digitized and OCRd. We describe a framework for processing the OCRd text to identify articles and extract metadata for them. We describe the article schema and provide examples of features that facilitate automatic indexing of them. For this processing, we employ lexical semantics, structural models, and community content. Furthermore, we describe visualization and summarization techniques that can be used to present the extracted events.
- Generates a METS file connecting image areas, OCRed text and ground truth documents encoded in TEI XML.
- METS / ALTO technical information

