Inproceedings

Indexing Historical Documents by Word Shape Signatures

J. Llados and G. Sanchez.
Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), volume 1, pages 362-366 (September 2007)
DOI: 10.1109/ICDAR.2007.4378733

Abstract

This paper presents a word-spotting approach for indexing archival document images. Indices are constructed from keyword images, and the spotting strategy is formulated on an indexing-by-shape basis. The well-known shape context descriptor is used to compute word image signatures from skeleton points. Codewords are then extracted from thresholded shape contexts, yielding a simpler and more compact representation based on bit vectors. Document images are roughly segmented into words and a lookup table is constructed, with each word subimage taken as a bin. Keyword images are spotted in documents by a voting strategy: the codewords index into the lookup table, and votes are cast in the corresponding bins. The approach is illustrated in a real application scenario consisting of documents from a digital archive of the Spanish Civil War.
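
The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the log-polar bin layout (3 radial by 8 angular bins), the threshold of one point per bin, and the names `codeword`, `build_table`, and `spot` are all hypothetical, and skeleton extraction and word segmentation are assumed to have been done already, so each word is given as a list of (x, y) skeleton points.

```python
import math
from collections import defaultdict


def codeword(points, i, n_r=3, n_theta=8, threshold=1):
    """Thresholded shape context of points[i], packed into an int bit vector.

    Bin layout and threshold are illustrative assumptions, not values
    taken from the paper.
    """
    px, py = points[i]
    offsets = [(x - px, y - py) for j, (x, y) in enumerate(points) if j != i]
    dists = [math.hypot(dx, dy) for dx, dy in offsets]
    # Normalize radii by the mean distance so the code tolerates scaling.
    mean_d = sum(dists) / len(dists) if dists else 0.0
    if mean_d == 0.0:
        mean_d = 1.0
    counts = [0] * (n_r * n_theta)
    for (dx, dy), d in zip(offsets, dists):
        if d == 0.0:
            continue  # a coincident point has no direction
        # Log-spaced radial bins, clipped to the valid range.
        r_bin = min(n_r - 1, max(0, math.floor(math.log2(d / mean_d)) + 2))
        angle = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = int(angle / (2 * math.pi) * n_theta) % n_theta
        counts[r_bin * n_theta + t_bin] += 1
    # Threshold each histogram bin to a single bit.
    bits = 0
    for k, c in enumerate(counts):
        if c >= threshold:
            bits |= 1 << k
    return bits


def build_table(words):
    """Lookup table: codeword -> ids of the word subimages (bins) containing it."""
    table = defaultdict(list)
    for word_id, points in words.items():
        for i in range(len(points)):
            table[codeword(points, i)].append(word_id)
    return table


def spot(keyword_points, table):
    """Index the table with each keyword codeword and vote for the listed bins."""
    votes = defaultdict(int)
    for i in range(len(keyword_points)):
        for word_id in table.get(codeword(keyword_points, i), ()):
            votes[word_id] += 1
    return dict(votes)
```

In this sketch, a query would call `build_table` once over the segmented word subimages and then rank the result of `spot` by vote count. Exact codeword matching is used for simplicity; a tolerant match (for example, allowing a small Hamming distance) would be a natural variation.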

BibTeX key: 4378733
entry type: inproceedings
booktitle: Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)
year: 2007
month: Sep.
pages: 362-366
volume: 1
issn: 2379-2140
DOI: 10.1109/ICDAR.2007.4378733
url: http://ieeexplore.ieee.org/abstract/document/4378733
