hOCR is a format for representing OCR output, including layout information, character confidences, bounding boxes, and style information. It embeds this information invisibly in standard HTML. By building on standard HTML, it automatically inherits well-defined support for most scripts, languages, and common layout options. Furthermore, unlike previous OCR formats, the recognized text and the OCR-related information co-exist in the same file and survive editing and manipulation. hOCR markup is independent of the presentation.
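As a rough illustration of how OCR information can ride along inside ordinary HTML, the sketch below reads word text, bounding boxes, and confidences out of a small hOCR-style fragment using only the Python standard library. It assumes the common convention of `ocrx_word` spans that carry `bbox` and `x_wconf` properties in their `title` attribute. The sample markup and the helper names (`parse_title`, `HocrWordParser`, `extract_words`) are made up for this example, and the parser does not handle spans nested inside a word.

```python
from html.parser import HTMLParser


def parse_title(title):
    """Split an hOCR title string such as 'bbox 10 20 120 60; x_wconf 93'."""
    props = {}
    for field in title.split(";"):
        parts = field.split()
        if not parts:
            continue
        name, values = parts[0], parts[1:]
        if name == "bbox":
            props[name] = tuple(int(v) for v in values)
        elif name == "x_wconf":
            props[name] = float(values[0])
        else:
            props[name] = values  # keep other properties as raw strings
    return props


class HocrWordParser(HTMLParser):
    """Collects text, bounding box, and confidence for each ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # the word being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            props = parse_title(attrs.get("title") or "")
            self._current = {
                "text": "",
                "bbox": props.get("bbox"),
                "conf": props.get("x_wconf"),
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        # Assumption: a word span contains no nested spans.
        if tag == "span" and self._current is not None:
            self.words.append(self._current)
            self._current = None


def extract_words(html):
    parser = HocrWordParser()
    parser.feed(html)
    parser.close()
    return parser.words


# Made-up sample fragment, for illustration only.
SAMPLE = """<span class='ocr_line' title='bbox 10 20 300 60'>
<span class='ocrx_word' title='bbox 10 20 120 60; x_wconf 93'>Hello</span>
<span class='ocrx_word' title='bbox 130 20 300 60; x_wconf 88'>world</span>
</span>"""

for word in extract_words(SAMPLE):
    print(word["text"], word["bbox"], word["conf"])
```

Because the properties live in attributes, a browser shows only the recognized text, while a tool like this one can still recover the layout data.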
Large quantities of historical newspapers are being digitized and OCR'd. We describe a framework for processing the OCR'd text to identify articles and extract metadata for them. We describe the article schema and provide examples of features that facilitate automatic indexing of the articles. For this processing, we employ lexical semantics, structural models, and community content. Furthermore, we describe visualization and summarization techniques that can be used to present the extracted events.
Interactive, 174-page PDF version of the training manual "Adobe InDesign - clever, verständlich, praxisnah", available as a free download. The book mainly covers the CS3 version but is also of interest to users of InDesign CS2 or CS4, since the InDesign versions hardly differ in many respects. The download is worthwhile even for users who have already bought the print version, because the interactive bookmarks and the search function make it quick to find content.
A simple particle system physics engine for Processing. I've designed this to be application / domain agnostic. All this is supposed to do is let you make particles, apply forces, and calculate the positions of particles over time in real time. Anything else you need to handle yourself.
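To make the "make particles, apply forces, calculate positions" loop concrete, here is a minimal sketch in Python. It is not the engine's actual API: the names `Particle`, `ParticleSystem`, `make_particle`, `add_force`, and `tick` are hypothetical, and the semi-implicit Euler step was chosen for brevity rather than taken from the engine.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0
    mass: float = 1.0


class ParticleSystem:
    """Hypothetical minimal particle system: particles, forces, time steps."""

    def __init__(self, gravity=(0.0, 0.0)):
        self.gravity = gravity  # constant acceleration applied to every particle
        self.particles = []
        self.forces = []        # callables: particle -> (fx, fy)

    def make_particle(self, x, y, mass=1.0):
        p = Particle(x, y, mass=mass)
        self.particles.append(p)
        return p

    def add_force(self, force):
        self.forces.append(force)

    def tick(self, dt=1.0):
        """Advance all particles by dt using semi-implicit Euler."""
        gx, gy = self.gravity
        for p in self.particles:
            # Sum accelerations: gravity plus every force divided by mass.
            ax, ay = gx, gy
            for force in self.forces:
                fx, fy = force(p)
                ax += fx / p.mass
                ay += fy / p.mass
            # Update velocity first, then position with the new velocity.
            p.vx += ax * dt
            p.vy += ay * dt
            p.x += p.vx * dt
            p.y += p.vy * dt


# Usage: one particle falling under constant gravity.
system = ParticleSystem(gravity=(0.0, 1.0))
ball = system.make_particle(0.0, 0.0)
for _ in range(2):
    system.tick(dt=1.0)
print(ball.x, ball.y)
```

Drawing, collision handling, and anything else application-specific are left to the caller, in keeping with the domain-agnostic design described above.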