Abstract

Social tagging systems have become increasingly popular for sharing and organizing web resources, and tag recommendation is a common feature of such systems. Social tagging is by nature an incremental process: once a user has saved a web page with tags, the tagging system can provide more accurate predictions for that user, based on the user’s incremental behavior. However, existing tag prediction methods do not consider this important factor, since their training and test datasets are either split by a fixed time stamp or randomly sampled from a larger corpus. In our temporal experiments, we perform a time-sensitive sampling on an existing public dataset, resulting in a new scenario which is much closer to the “real world”. In this paper, we address the problem of tag prediction by proposing a probabilistic model for personalized tag prediction. The model is a Bayesian approach that integrates three factors: an ego-centric effect, environmental effects, and web page content. Two methods, intuitive calculation and learning optimization, are provided for parameter estimation. Pure graph-based methods, which may have significant constraints (such as requiring every user, every item, and every tag to occur in at least p posts), cannot make a prediction in most “real-world” cases, while our model improves the F-measure by over 30% compared to a leading algorithm on a publicly available real-world dataset.
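The incremental evaluation setting described above can be illustrated with a minimal Python sketch. It is not the paper's Bayesian model: the predictor here is a simple stand-in that recommends the user's most frequent earlier tags, and the names `f_measure`, `incremental_evaluation`, and the `(timestamp, user, tags)` post layout are assumptions made only for illustration. The point is the protocol: posts are processed in time order, each post is predicted from strictly earlier posts by the same user, and the prediction is scored with the F-measure.

```python
from collections import Counter, defaultdict


def f_measure(predicted, actual):
    """F1 score between predicted tags and the tags the user actually assigned."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(actual)
    return 2 * precision * recall / (precision + recall)


def incremental_evaluation(posts, k=2):
    """Average F1 over posts, predicting each one from the user's earlier posts only.

    posts: iterable of (timestamp, user, tags) tuples (hypothetical layout).
    k: number of tags to recommend per post.
    """
    history = defaultdict(Counter)  # per-user tag counts seen so far
    scores = []
    for _, user, tags in sorted(posts, key=lambda p: p[0]):
        if history[user]:
            # Stand-in predictor: the user's k most frequent earlier tags.
            predicted = [tag for tag, _ in history[user].most_common(k)]
            scores.append(f_measure(predicted, tags))
        # Only after scoring does the post become part of the history.
        history[user].update(tags)
    return sum(scores) / len(scores) if scores else 0.0
```

A user's first post is skipped because no history is available yet. A model along the lines of the abstract would replace the frequency-based predictor with one that also draws on other users' tags and the page content.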
