Abstract
Collaborative filtering or recommender systems use a database about user
preferences to predict additional topics or products a new user might like. In
this paper we describe several algorithms designed for this task, including
techniques based on correlation coefficients, vector-based similarity
calculations, and statistical Bayesian methods. We compare the predictive
accuracy of the various methods in a set of representative problem domains. We
use two basic classes of evaluation metrics. The first characterizes accuracy
over a set of individual predictions in terms of average absolute deviation.
The second estimates the utility of a ranked list of suggested items. This
metric uses an estimate of the probability that a user will see a
recommendation in an ordered list. Experiments were run for datasets associated
with 3 application areas, 4 experimental protocols, and the 2 evaluation
metrics for the various algorithms. Results indicate that for a wide range of
conditions, Bayesian networks with decision trees at each node and correlation
methods outperform Bayesian-clustering and vector-similarity methods. Between
correlation and Bayesian networks, the preferred method depends on the nature
of the dataset, the nature of the application (ranked versus one-by-one
presentation), and the availability of votes with which to make predictions.
Other considerations include the size of the database, the speed of
predictions, and learning time.
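
The two classes of evaluation metric described above can be illustrated with a minimal sketch. The abstract does not give the formulas, so the details here are assumptions: the first function averages absolute differences between predicted and observed votes, and the second assumes that the probability of a user seeing the item at a given rank decays exponentially, halving at a hypothetical `half_life` rank. The function names, the `neutral_vote` parameter, and the default values are illustrative only and are not taken from the text.

```python
from typing import Sequence


def mean_absolute_deviation(predicted: Sequence[float],
                            actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and observed votes."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def ranked_list_utility(votes_in_rank_order: Sequence[float],
                        neutral_vote: float = 0.0,
                        half_life: float = 5.0) -> float:
    """Expected utility of a ranked list under an assumed viewing model.

    Assumption: the item at rank 1 is always seen, and the probability of
    seeing an item halves by the time the user reaches rank `half_life`.
    Only votes above `neutral_vote` contribute utility.
    """
    if half_life <= 1:
        raise ValueError("half_life must be greater than 1")
    total = 0.0
    for rank, vote in enumerate(votes_in_rank_order, start=1):
        # Assumed exponential decay in the chance the user looks this far down.
        view_probability = 0.5 ** ((rank - 1) / (half_life - 1))
        total += max(vote - neutral_vote, 0.0) * view_probability
    return total
```

Under this sketch, a lower deviation indicates better individual predictions, while a higher utility indicates that well-liked items were placed near the top of the list, where the user is assumed more likely to see them.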