Abstract
Recommender systems based on collaborative filtering usually require
real-time comparison of users' ratings on objects. In the context
of Web personalization, particularly at the early stages of a visitor's
interaction with the site (i.e., before registration or authentication),
recommender systems must rely on anonymous clickstream data. The
lack of explicit user ratings and the sheer amount of data in such
a setting pose serious challenges to standard collaborative filtering
techniques in terms of scalability and performance. Offline clustering
of user transactions can be used to improve the scalability of collaborative
filtering; however, this often comes at the cost of reduced recommendation
accuracy. In this paper we study the impact of various preprocessing
techniques applied to clickstream data, such as clustering, normalization,
and significance filtering, on collaborative filtering. Our experimental
results, obtained on real usage data, indicate that with proper
data preparation, the clustering-based approach to collaborative
filtering can achieve dramatic improvements in terms of recommendation
effectiveness, while maintaining the computational advantage over
the direct approaches such as the k-Nearest-Neighbor technique.