Tweets2011
As part of the TREC 2011 microblog track, Twitter provided identifiers for approximately 16 million tweets sampled between January 23rd and February 8th, 2011. The corpus is designed to be a reusable, representative sample of the twittersphere; that is, it includes both important tweets and spam.
A. Halevy and J. Madhavan. IJCAI-03, Proceedings of the Eighteenth International Joint Conference
on Artificial Intelligence, Acapulco, Mexico, August 9-15, 2003, pp. 1567-1572. Morgan Kaufmann, (2003)