Inproceedings,

Top-N Recommendation Algorithms: A Quest for the State-of-the-Art

V. Anelli, A. Bellogín, T. Di Noia, D. Jannach, and C. Pomo.
Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization, pages 121-131. ACM (July 2022)
DOI: 10.1145/3503252.3531292

Abstract

Research on recommender systems algorithms, like other areas of applied machine learning, is largely dominated by efforts to improve the state-of-the-art, typically in terms of accuracy measures. Several recent research works however indicate that the reported improvements over the years sometimes “don’t add up”, and that methods that were published several years ago often outperform the latest models when evaluated independently. Different factors contribute to this phenomenon, including that some researchers probably often only fine-tune their own models but not the baselines. In this paper, we report the outcomes of an in-depth, systematic, and reproducible comparison of ten collaborative filtering algorithms—covering both traditional and neural models—on several common performance measures on three datasets which are frequently used for evaluation in the recent literature. Our results show that there is no consistent winner across datasets and metrics for the examined top-n recommendation task. Moreover, we find that for none of the accuracy measurements any of the considered neural models led to the best performance. Regarding the performance ranking of algorithms across the measurements, we found that linear models, nearest-neighbor methods, and traditional matrix factorization consistently perform well for the evaluated modest-sized, but commonly-used datasets. Our work shall therefore serve as a guideline for researchers regarding existing baselines to consider in future performance comparisons. Moreover, by providing a set of fine-tuned baseline models for different datasets, we hope that our work helps to establish a common understanding of the state-of-the-art for top-n recommendation tasks.

BibTeX key: Anelli_2022
entry type: inproceedings
booktitle: Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization
year: 2022
month: jul
pages: 121-131
publisher: ACM
DOI: 10.1145/3503252.3531292
url: https://doi.org/10.1145%2F3503252.3531292


Cite this publication

%0 Conference Paper
%1 Anelli_2022
%A Anelli, Vito Walter
%A Bellogín, Alejandro
%A Di Noia, Tommaso
%A Jannach, Dietmar
%A Pomo, Claudio
%B Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization
%D 2022
%I ACM
%K evaluation ranking recommender umap2022
%P 121-131
%R 10.1145/3503252.3531292
%T Top-N Recommendation Algorithms: A Quest for the State-of-the-Art
%U https://doi.org/10.1145%2F3503252.3531292
