Although term extraction has been researched for more than 20 years, only a few studies focus on under-resourced languages. Moreover, bilingual term mapping from comparable corpora for these languages has attracted researchers only recently. This paper presents methods for term extraction, term tagging in documents, and bilingual term mapping from comparable corpora for four under-resourced languages: Croatian, Latvian, Lithuanian, and Romanian. The methods described in this paper are language-independent, provided that the user supplies language-specific parameter data and has access to a part-of-speech or morpho-syntactic tagger.
P. Haase, J. Broekstra, A. Eberhart, and R. Volz. Proceedings of the Third International Semantic Web Conference, Hiroshima, Japan, 2004, 3298, pp. 502-517. Springer Berlin / Heidelberg, (November 2004)
Y. Low, C. Lim, W. Cai, S. Huang, W. Hsu, S. Jain, and S. Turner. Simulation: Transactions of the Society for Computer Simulation (SCS), Joint Special Issue on Parallel and Distributed Simulation, 72 (3): 170-186 (1999)
W. Cook, W. Hill, and P. Canning. POPL '90. Proceedings of the seventeenth annual ACM symposium on Principles of programming languages, January 17-19, 1990, San Francisco, CA, pp. 125-135. New York, NY, USA, ACM Press, (1990)