This collection consists of ~20M web queries collected from ~650k users over three months.
The data is sorted by anonymous user ID and sequentially arranged.
Here at Google Research we have been using word n-gram models for a variety of R&D projects, such as statistical machine translation, speech recognition, spelling correction, entity detection, information extraction, and others. While such models have usu
E. Žunić, K. Korjenić, S. Delalić, and Z. Šubara. International Journal of Computer Science and Information Technology (IJCSIT), 13 (2): 67-84 (April 2021)
H. Zhang, A. Santos, and J. Freire. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, ACM, (October 2021)
Z. Yang, P. Qi, S. Zhang, Y. Bengio, W. Cohen, R. Salakhutdinov, and C. Manning. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2369-2380. Brussels, Belgium, Association for Computational Linguistics, (2018)