Determining the user intent of Web searches is a difficult problem
due to the sparse data available concerning the searcher. In this
paper, we examine a method to determine the user intent
underlying Web search engine queries. We qualitatively analyze
samples of queries from seven transaction logs from three different
Web search engines containing more than five million queries.
From this analysis, we identified characteristics of user queries
based on three broad classifications of user intent. The
classifications of informational, navigational, and transactional
represent the type of content destination the searcher desired as
expressed by their query. We implemented our classification
algorithm and automatically classified a separate Web search
engine transaction log of over a million queries submitted by
several hundred thousand users. Our findings show that more than
80% of Web queries are informational in nature, with about 10%
each being navigational and transactional. In order to validate the
accuracy of our algorithm, we manually coded 400 queries and
compared the classification to the results from our algorithm. This
comparison showed that our automatic classification has an
accuracy of 74%. Of the remaining 25% of the queries, the user
intent is generally vague or multi-faceted, pointing to the need
for probabilistic classification. We illustrate how knowledge of
searcher intent might be used to enhance future Web search
engines.
Description
Determining the user intent of web search engine queries
%0 Conference Paper
%1 paper:jansen:2007
%A Jansen, Bernard J.
%A Booth, Danielle L.
%A Spink, Amanda
%B WWW '07: Proceedings of the 16th international conference on World Wide Web
%C New York, NY, USA
%D 2007
%I ACM
%K intent query search user-intent web
%P 1149--1150
%R http://doi.acm.org/10.1145/1242572.1242739
%T Determining the user intent of web search engine queries
%U http://portal.acm.org/citation.cfm?id=1242739
%X Determining the user intent of Web searches is a difficult problem
due to the sparse data available concerning the searcher. In this
paper, we examine a method to determine the user intent
underlying Web search engine queries. We qualitatively analyze
samples of queries from seven transaction logs from three different
Web search engines containing more than five million queries.
From this analysis, we identified characteristics of user queries
based on three broad classifications of user intent. The
classifications of informational, navigational, and transactional
represent the type of content destination the searcher desired as
expressed by their query. We implemented our classification
algorithm and automatically classified a separate Web search
engine transaction log of over a million queries submitted by
several hundred thousand users. Our findings show that more than
80% of Web queries are informational in nature, with about 10%
each being navigational and transactional. In order to validate the
accuracy of our algorithm, we manually coded 400 queries and
compared the classification to the results from our algorithm. This
comparison showed that our automatic classification has an
accuracy of 74%. Of the remaining 25% of the queries, the user
intent is generally vague or multi-faceted, pointing to the need
for probabilistic classification. We illustrate how knowledge of
searcher intent might be used to enhance future Web search
engines.
%@ 978-1-59593-654-7
@inproceedings{paper:jansen:2007,
abstract = {Determining the user intent of Web searches is a difficult problem
due to the sparse data available concerning the searcher. In this
paper, we examine a method to determine the user intent
underlying Web search engine queries. We qualitatively analyze
samples of queries from seven transaction logs from three different
Web search engines containing more than five million queries.
From this analysis, we identified characteristics of user queries
based on three broad classifications of user intent. The
classifications of informational, navigational, and transactional
represent the type of content destination the searcher desired as
expressed by their query. We implemented our classification
algorithm and automatically classified a separate Web search
engine transaction log of over a million queries submitted by
several hundred thousand users. Our findings show that more than
80% of Web queries are informational in nature, with about 10%
each being navigational and transactional. In order to validate the
accuracy of our algorithm, we manually coded 400 queries and
compared the classification to the results from our algorithm. This
comparison showed that our automatic classification has an
accuracy of 74%. Of the remaining 25% of the queries, the user
intent is generally vague or multi-faceted, pointing to the need
for probabilistic classification. We illustrate how knowledge of
searcher intent might be used to enhance future Web search
engines.},
added-at = {2008-10-10T12:49:46.000+0200},
address = {New York, NY, USA},
author = {Jansen, Bernard J. and Booth, Danielle L. and Spink, Amanda},
biburl = {https://www.bibsonomy.org/bibtex/2b9af6a2b649f68e07aca187f6312049d/mschuber},
booktitle = {WWW '07: Proceedings of the 16th international conference on World Wide Web},
description = {Determining the user intent of web search engine queries},
doi = {10.1145/1242572.1242739},
interhash = {c3de491f7e199f93adc1827ac08df8b4},
intrahash = {b9af6a2b649f68e07aca187f6312049d},
isbn = {978-1-59593-654-7},
keywords = {intent query search user-intent web},
location = {Banff, Alberta, Canada},
pages = {1149--1150},
publisher = {ACM},
timestamp = {2008-10-10T12:49:46.000+0200},
title = {Determining the user intent of web search engine queries},
url = {http://portal.acm.org/citation.cfm?id=1242739},
year = 2007
}