A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization
T. Joachims. Proceedings of ICML-97, 14th International Conference on Machine Learning, pages 143--151. Nashville, US, Morgan Kaufmann Publishers, San Francisco, US (1997)
Abstract
The Rocchio relevance feedback algorithm is one of the most popular and widely applied learning methods from information retrieval. Here, a probabilistic analysis of this algorithm is presented in a text categorization framework. The analysis gives theoretical insight into the heuristics used in the Rocchio algorithm, particularly the word weighting scheme and the similarity metric. It also suggests improvements which lead to a probabilistic variant of the Rocchio classifier. The Rocchio...
%0 Conference Paper
%1 citeulike:1711972
%A Joachims, Thorsten
%B Proceedings of ICML-97, 14th International Conference on Machine Learning
%C Nashville, US
%D 1997
%E Fisher, Douglas H.
%I Morgan Kaufmann Publishers, San Francisco, US
%K categorization, feedback, learning, machine, relevance, rocchio, text
%P 143--151
%T A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization
%U http://citeseer.ist.psu.edu/54920.html
%X The Rocchio relevance feedback algorithm is one of the most popular and widely applied learning methods from information retrieval. Here, a probabilistic analysis of this algorithm is presented in a text categorization framework. The analysis gives theoretical insight into the heuristics used in the Rocchio algorithm, particularly the word weighting scheme and the similarity metric. It also suggests improvements which lead to a probabilistic variant of the Rocchio classifier. The Rocchio...
@inproceedings{citeulike:1711972,
abstract = {The Rocchio relevance feedback algorithm is one of the most popular and widely applied learning methods from information retrieval. Here, a probabilistic analysis of this algorithm is presented in a text categorization framework. The analysis gives theoretical insight into the heuristics used in the Rocchio algorithm, particularly the word weighting scheme and the similarity metric. It also suggests improvements which lead to a probabilistic variant of the Rocchio classifier. The Rocchio...},
added-at = {2008-06-17T16:01:02.000+0200},
address = {Nashville, US},
author = {Joachims, Thorsten},
biburl = {https://www.bibsonomy.org/bibtex/2a57078d6bd4695f6831bedcf09b5ed89/pprett},
booktitle = {Proceedings of ICML-97, 14th International Conference on Machine Learning},
citeulike-article-id = {1711972},
editor = {Fisher, Douglas H.},
interhash = {6458eb4b46e7343e20453784ab487bb2},
intrahash = {a57078d6bd4695f6831bedcf09b5ed89},
keywords = {categorization, feedback, learning, machine, relevance, rocchio, text},
pages = {143--151},
posted-at = {2007-09-30 19:02:18},
priority = {0},
publisher = {Morgan Kaufmann Publishers, San Francisco, US},
timestamp = {2008-06-17T16:02:07.000+0200},
title = {A probabilistic analysis of the Rocchio algorithm with {TFIDF} for text categorization},
url = {http://citeseer.ist.psu.edu/54920.html},
year = 1997
}