Several resources have been compiled within the MuchMore project. These include a bilingual, parallel medical corpus; corresponding queries and relevance assessments; evaluation sets of disambiguated terms for GermaNet and UMLS; and an evaluation list for morphological decomposition of medical terms.
MALLET is an integrated collection of Java code useful for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications to text.
I'm interested in machine learning techniques (graphical models, kernel methods) applied to text understanding (entity and relation extraction, coreference resolution, document classification and clustering, confidence prediction, social network analysis, data mining).
J. Zhang, Y. Dong, Y. Wang, J. Tang, and M. Ding. Proceedings of the 28th International Joint Conference on Artificial Intelligence, pages 4278–4284. AAAI Press, (Aug 10, 2019)