XOXO (eXtensible Open XHTML Outlines) is an XML microformat for outlines built on top of XHTML. Developed by several authors as an attempt to reuse XHTML building blocks instead of inventing unnecessary new XML elements/attributes, XOXO is based on existing conventions for publishing outlines, lists, and blogrolls on the Web. The XOXO specification defines an outline as a hierarchical, ordered list of arbitrary elements. The specification is fairly open which makes it suitable for many types of list data. E.g. the more semantic version of the S5 presentation file format is based upon XOXO.
hOCR is a format for representing OCR output, including layout information, character confidences, bounding boxes, and style information. It embeds this information invisibly in standard HTML. By building on standard HTML, it automatically inherits well-defined support for most scripts, languages, and common layout options. Furthermore, unlike previous OCR formats, the recognized text and OCR-related information co-exist in the same file and survives editing and manipulation. hOCR markup is independent of the presentation.
OCRopus is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-lingual capabilities. This server allows you to use the system through your web browser.
hResume is a microformat for publishing résumé or Curriculum Vitae (CV) information [1] using (X)HTML on web pages. Like many other microformats, hResume uses CSS class names to make an otherwise non-semantic XHTML document more meaningful. A document containing resume information could be improved to use hResume without altering the appearance to the browser, making it easy to adopt.
Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Instead of throwing away what works today, microformats intend to solve simpler problems first by adapting to current behaviors and usage patterns (e.g. XHTML, blogging).