Although term extraction has been researched for more than 20 years, only a few studies focus on under-resourced languages. Moreover, bilingual term mapping from comparable corpora for these languages has attracted researchers' attention only recently. This paper presents methods for term extraction, term tagging in documents, and bilingual term mapping from comparable corpora for four under-resourced languages: Croatian, Latvian, Lithuanian, and Romanian. The methods described in this paper are language-independent, provided that the user supplies language-specific parameter data and has access to a part-of-speech or morpho-syntactic tagger.
In this project, we provide our implementations of CNN [Zeng et al., 2014] and PCNN [Zeng et al., 2015], along with their extended versions that use a sentence-level attention scheme [Lin et al., 2016].
NYT10 was originally released with the paper "Sebastian Riedel, Limin Yao, and Andrew McCallum. Modeling relations and their mentions without labeled text."
T. Rindflesch, J. Rajan, and L. Hunter. Proceedings of the Sixth Conference on Applied Natural Language Processing, pages 188--195. Morristown, NJ, USA, Association for Computational Linguistics, 2000.