Going Grey? Comparing the OCR Accuracy Levels of Bitonal and Greyscale Images


Description

Newspaper collections are the subject of an increasing number of large-scale digitisation projects. In Papers Past (http://paperspast.natlib.govt.nz), a collection of over a million newspaper pages, the introduction of full-text search has made a wealth of previously hidden information findable. The search feature depends on text extracted from the newspaper page images with Optical Character Recognition (OCR), so any improvement in OCR accuracy adds value to the collection by improving users' chances of finding useful information.
