Building and operating large-scale information retrieval systems used by hundreds of millions of people around the world presents a number of interesting challenges. Designing such systems requires making complex tradeoffs along a number of dimensions, including (a) the number of user queries that must be handled per second and the response latency to these requests, (b) the number and size of the various corpora that are searched, (c) the latency and frequency with which documents are updated or added to the corpora, and (d) the quality and cost of the ranking algorithms that are used for retrieval. In this talk I'll discuss the evolution of Google's hardware infrastructure and information retrieval systems and some of the design challenges that arise from ever-increasing demands in all of these dimensions. I'll also describe how we use various pieces of distributed systems infrastructure when building these retrieval systems. Finally, I'll describe some future challenges and open research problems in this area.
Z. Cheng, J. Caverlee, and K. Lee. Proceedings of the 19th ACM International Conference on Information and Knowledge Management, page 759--768. New York, NY, USA, ACM, (2010)
M. Strube, and S. Ponzetto. Proceedings of the National Conference on Artificial Intelligence, 21, page 1419. Menlo Park, CA; Cambridge, MA; London, AAAI Press; MIT Press, (2006)
O. Sunercan, and A. Birturk. AAAI Spring Symposium on Linked Data Meets Artificial Intelligence (Linked AI 2010). Stanford, USA, (2010)
R. Mihalcea, and A. Csomai. Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management, page 233--242. New York, NY, USA, ACM, (2007)
Z. Cheng, J. Caverlee, H. Barthwal, and V. Bachani. Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval, page 335--344. New York, NY, USA, ACM, (2014)
B. Bi, B. Kao, C. Wan, and J. Cho. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, page 1506--1515. New York, NY, USA, ACM, (2014)
O. Tsur, and A. Rappoport. Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, page 643--652. New York, NY, USA, ACM, (2012)
A. Olteanu, S. Vieweg, and C. Castillo. Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, page 994--1009. New York, NY, USA, ACM, (2015)
H. Kwak, C. Lee, H. Park, and S. Moon. Proceedings of the 19th International Conference on World Wide Web, page 591--600. New York, NY, USA, ACM, (2010)