You Shall Not Pass: Detecting Malicious Users at Registration Time
C. Kater and R. Jäschke. Proceedings of the 1st International Workshop on Online Safety, Trust and Fraud Prevention, pages 2:1--2:6. New York, NY, USA, ACM, (June 2016)
Spam is a widespread problem for many online services. The use case in this paper is the social bookmarking system BibSonomy, which received over 150 times more registrations from spam users than from normal users over the last ten years.
A common approach to fighting spam is to use machine learning to classify users as legitimate or malicious. Features are derived from the information users provide to the service, such as profile information or posts, and a classifier bases its decision on them. However, this often means that the spam accounts are already active and able to post spam before they are detected. In this work we propose an approach for deciding at registration time whether a user is malicious. To this end, we extracted 177 features from the information users provide during the registration process, their IP address, and their registration time. With these features we used state-of-the-art classifiers to identify users as spammers or regular users. With the best classifier, we reached an AUC of 0.912.
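To make the idea of registration-time features and the AUC metric concrete, the following is a minimal sketch and not the authors' implementation. The function names `extract_features` and `auc`, and the handful of features shown, are hypothetical illustrations. The abstract does not list the paper's 177 features or name its classifiers, so none of them are reproduced here.

```python
from datetime import datetime
import ipaddress


def extract_features(username, email, ip, registered_at):
    """Map one registration record to numeric features.

    The features below are illustrative assumptions, not the paper's set.
    """
    local, _, domain = email.partition("@")
    addr = ipaddress.ip_address(ip)
    return {
        # Form data supplied by the user
        "username_length": len(username),
        "username_digit_ratio": sum(c.isdigit() for c in username) / max(len(username), 1),
        "email_local_length": len(local),
        "email_domain_labels": domain.count(".") + 1 if domain else 0,
        # IP address
        "ip_is_ipv6": int(addr.version == 6),
        "ip_is_private": int(addr.is_private),
        # Registration time
        "hour_of_day": registered_at.hour,
        "is_weekend": int(registered_at.weekday() >= 5),
    }


def auc(labels, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    spammer (label 1) gets a higher score than a randomly chosen regular
    user (label 0). Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up example record, for illustration only
    record = extract_features(
        "buy4cheap99", "x1@mail.example.com", "8.8.8.8",
        datetime(2016, 6, 4, 3, 15),
    )
    print(record)
    # Made-up labels and classifier scores
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In practice the feature dictionaries would be fed to a trained classifier, and `auc` would be computed on that classifier's scores for held-out users. The pairwise formulation is quadratic in the number of users and serves only to show what the metric measures.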