Incollection,

Evaluation of Alignment Methods for HTML Parallel Text

, , , and .
Advances in Natural Language Processing, volume 4139 of Lecture Notes in Computer Science, Springer, Berlin / Heidelberg, (2006)

Abstract

The Internet constitutes a potentially huge store of parallel text that can be collected and exploited by many applications, such as multilingual information retrieval and machine translation. These applications usually require bilingual text that is aligned at least at the sentence level. This paper presents new aligners designed to improve the performance of classical sentence-level aligners when aligning structured text such as HTML. The new aligners are compared with other well-known geometric aligners.
