Here at Google Research we have been using word n-gram models for a variety of R&D projects, such as statistical machine translation, speech recognition, spelling correction, entity detection, information extraction, and others. While such models have usu
This collection consists of ~20M web queries collected from ~650k users over three months.
The data is sorted by anonymous user ID and sequentially arranged.
A number of resources have been compiled within the context of the MuchMore project. These include a bilingual, parallel medical corpus; corresponding queries and relevance assessments; evaluation sets of disambiguated terms for GermaNet and UMLS; and an evaluation list for morphological decomposition of medical terms.
A. Dulny, A. Hotho, and A. Krause. Machine Learning and Knowledge Discovery in Databases: Research Track, pages 438--455. Cham, Springer Nature Switzerland, (2023)