- Short introduction to the Vector Space Model (VSM). In information retrieval or text mining, the term frequency - inverse document frequency, also called tf-idf, is ...
- As a Google user, you're familiar with the speed and accuracy of a Google search. How exactly does Google manage to find the right results for every query as quickly as it does? The heart of Google's search technology is PigeonRank™, a system for ranking web pages developed by Google founders Larry Page and Sergey Brin at Stanford University.
- It's so fun to write oversimplified posts about how such-and-such is dead. Not because it's true. At best you can point out something is broken and alternatives are rising fast. But I wonder how the people behind the aging technology and...
- How does the web search behavior of "rich" and "poor" people differ? Do men and women tend to click on different results for the same query? What are some queries almost exclusively issued by African Americans? These are some of the questions we address in this study. Our research combines three data sources: the query log of a major US-based web search engine, profile information provided by 28 million of its users (birth year, gender and zip code), and US-census information including detailed demographic information aggregated at the level of ZIP code. Through this combination we can annotate each query with, e.g., the average per-capita income in the ZIP code it originated from. Though conceptually simple, this combination immediately creates a powerful demographic profiling tool. The main contributions of this work are the following. First, we provide a demographic description of a large sample of search engine users in the US and show that it agrees well with the distribution of the US population. Second, we describe how different segments of the population differ in their search behavior, e.g. with respect to the diversity of formulated queries or with respect to the clicked URLs. Third, we explore applications of our methodology to improve web search and, in particular, to help issue query reformulations. These results enable the creation of a powerful tool for improved user modeling in practice, with many applications including improving web search and advertising. For instance, advertisements for "family vacations" could be adapted to the (expected) income of the person issuing the query, or search suggestions shown to users could be adapted to items that are more interesting given their particular characteristics.
- Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site. Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site. Sitemap 0.90 is offered under the terms of the Attribution-ShareAlike Creative Commons License and has wide adoption, including support from Google, Yahoo!, and Microsoft.
- A Fast String Searching Algorithm, with R.S. Boyer. Communications of the Association for Computing Machinery, 20(10), 1977, pp. 762-772.
- Use pictures to search the web. A picture is worth a thousand words. No need to type your search anymore. Just take a picture. Find out what businesses are nearby. Just point your phone at a store. This is just the beginning - it's not quite perfect yet. Works well for some things, but not for all. Your pictures, your control. Turn on 'visual search history' to view or share your pictures at any time. Turn it off to discard them once the search is done.
- Web search engines have changed our lives - enabling instant access to information about subjects that are both deeply important to us, as well as passing whims. The search engines that provide answers to our search queries also log those queries, in order to improve their algorithms. Academic research on search queries has shown that they can provide valuable information on diverse topics including word and phrase similarity, topical seasonality and may even have potential for sociology, as well as providing a barometer of the popularity of many subjects. At the same time, individuals are rightly concerned about what the consequences of accidental leaking or deliberate sharing of this information may mean for their privacy. In this talk I will cover the applications which have benefited from mining query logs, the risks that privacy can be breached by sharing query logs, and current algorithms for mining logs in a way to prevent privacy breaches.
- Use the form below to search the database of all lawyers admitted to practice in Germany, as well as the European lawyers admitted in Germany.
- Today, we're pleased to announce the launch of Web History
- Don't include this document in the Google search results: X-Robots-Tag: noindex
- Earth allows you to find files across a large network of machines and track disk usage in real time. It consists of a daemon that indexes filesystems in real time and reports all the changes back to a central database. This can then be queried through a simple, yet powerful, web interface. Think of it like Spotlight or Beagle but operating system independent with a central database for multiple machines with a web application that allows novel ways of exploring your data.
- This website searches for books directly on Amazon using the Amazon web service and extracts BIB data from the results for BibTeX/LaTeX.
- (SRU: Search and Retrieve via URL - Standards, Library of Congress)
- Proceedings of the fifth ACM conference on Recommender systems, page 45--52. New York, NY, USA, ACM, (2011)
- Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, page 451--458. New York, NY, USA, ACM, (2008)
- INFOCOM, 2010 Proceedings IEEE, page 1--5. IEEE, (March 2010)
- 2003-35. Stanford InfoLab, Stanford, (June 2003)
- Proceedings of the 12th international conference on Intelligent user interfaces, page 52--61. New York, NY, USA, ACM, (2007)
- Proceedings of the 2008 ACM conference on Computer supported cooperative work, page 485--494. New York, NY, USA, ACM, (2008)
- HT '08: Proceedings of the Nineteenth ACM Conference on Hypertext and Hypermedia, page 157--166. New York, NY, USA, ACM, (2008)
- JCDL '07: Proceedings of the 7th ACM/IEEE-CS Joint Conference on Digital Libraries, page 107--116. New York, NY, USA, ACM, (2007)
- CIKM '08: Proceeding of the 17th ACM conference on Information and knowledge management, page 73--82. New York, NY, USA, ACM, (2008)
- ACM Trans. Inf. Syst. 25(2):7 (2007)
- SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, page 154--161. New York, NY, USA, ACM, (2005)
- WWW '06: Proceedings of the 15th international conference on World Wide Web, page 543--552. New York, NY, USA, ACM, (2006)
- CIKM '04: Proceedings of the thirteenth ACM international conference on Information and knowledge management, page 118--126. New York, NY, USA, ACM, (2004)
- SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, page 35--42. New York, NY, USA, ACM, (2009)
- Communications of the ACM 20(10):762--772 (October 1977)
- Technical Report, 2003-29. Stanford InfoLab, (2003)
- Proceedings of the Second International Conference on Weblogs and Social Media ICWSM 2008, page 192--193. Menlo Park, CA, USA, AAAI Press, (2008)
- ACM Transactions on Internet Technology 5(1):92--128 (2005)
- Internet Mathematics 1(3):335--380 (2004)
- Technical Report, 1999-66. Stanford InfoLab, (November 1999)
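
The Sitemap entry in the list above describes an XML file that lists a site's URLs together with optional metadata (last update, change frequency, relative importance). As a rough illustration only, the following Python sketch builds such a file with the standard library. The helper name `build_sitemap`, the dictionary layout of the entries, and the `example.com` URLs are assumptions made for this example; the element names and namespace follow the public Sitemap 0.9 schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the Sitemap protocol schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return a Sitemap XML string for a list of entry dicts.

    Each entry needs a "loc" key; "lastmod", "changefreq" and "priority"
    are optional. (The dict layout is a choice made for this sketch.)
    """
    # Serialize the Sitemap namespace as the default (unprefixed) one.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # Emit child elements in the order used by the protocol examples.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                child = ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}")
                child.text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical site with two pages.
example = build_sitemap([
    {"loc": "https://example.com/", "lastmod": "2024-01-15",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "https://example.com/about", "priority": 0.3},
])
print(example)
```

As the entry itself notes, listing a URL this way is only a hint to crawlers and does not guarantee that the page is included in a search engine.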

