This collection consists of ~20M web queries collected from ~650k users over three months.
The data is sorted by anonymous user ID and sequentially arranged.
Here at Google Research we have been using word n-gram models for a variety of R&D projects, such as statistical machine translation, speech recognition, spelling correction, entity detection, information extraction, and others. While such models have usu
F. Bogo, J. Romero, M. Loper, and M. Black. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 3794--3801. Washington, DC, USA, IEEE Computer Society (2014)
B. Fetahu, U. Gadiraju, and S. Dietze. Proceedings of the ISWC 2014 Posters & Demonstrations Track, a track within the 13th International Semantic Web Conference, ISWC 2014, Riva del Garda, Italy, October 21, 2014, pages 433--436. (2014)