Inproceedings

Exploiting Hashtags for Adaptive Microblog Crawling

, , , and .
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pages 311--315. New York, NY, USA, ACM, (2013)
DOI: 10.1145/2492517.2492624

Abstract

Researchers have capitalized on microblogging services, such as Twitter, for detecting and monitoring real-world events. Existing approaches have based their conclusions on data collected by monitoring a set of pre-defined keywords. In this paper, we show that this manner of data collection risks losing a significant amount of relevant information. We then propose an adaptive crawling model that detects emerging popular hashtags and monitors them to retrieve greater amounts of highly associated data for events of interest. The proposed model analyzes the traffic patterns of the hashtags collected from the live stream to update subsequent collection queries. To evaluate this adaptive crawling model, we apply it to a dataset collected during the 2012 London Olympic Games. Our analysis shows that adaptive crawling based on the proposed Refined Keyword Adaptation algorithm collects a more comprehensive dataset than pre-defined keyword crawling, while introducing only a minimal amount of noise.
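
The general idea of updating collection queries from observed hashtag traffic can be illustrated with a minimal sketch. This is not the paper's Refined Keyword Adaptation algorithm, whose details are not given in this record; it is a simple frequency-and-growth heuristic. The function names and the parameters min_count, growth, and max_new are hypothetical, chosen only for illustration.

```python
from collections import Counter


def hashtags(text):
    """Return the distinct hashtags in one message, lowercased."""
    tags = set()
    for tok in text.split():
        tag = tok.lower().rstrip(".,!?:;")  # drop trailing punctuation
        if tag.startswith("#") and len(tag) > 1:
            tags.add(tag)
    return tags


def update_keywords(keywords, window_messages, prev_counts,
                    min_count=3, growth=2.0, max_new=5):
    """Hypothetical adaptation step (an assumption, not the paper's method).

    Counts hashtags in the current time window and adds to the keyword set
    those that are not yet tracked, appear at least `min_count` times, and
    have grown by at least a factor of `growth` over the previous window.
    Returns the updated keyword set and this window's counts.
    """
    counts = Counter()
    for text in window_messages:
        counts.update(hashtags(text))

    emerging = [
        tag for tag, n in counts.most_common()
        if tag not in keywords
        and n >= min_count
        and n >= growth * max(prev_counts.get(tag, 0), 1)
    ]
    return set(keywords) | set(emerging[:max_new]), counts


window = [
    "Go! #Olympics #TeamGB",
    "#teamgb wins gold #olympics",
    "so proud #TeamGB!",
    "#london2012 opening",
]
new_keywords, window_counts = update_keywords({"#olympics"}, window, {})
print(sorted(new_keywords))
```

In a live crawler, the returned keyword set would feed the next collection query, and the returned counts would be passed in as prev_counts for the following window.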
