In this post, I show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques (e.g., training a linear SVM with stochastic gradient descent) using scikit-learn.
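As a rough sketch of that workflow, the snippet below plugs an NLTK-based tokenizer into a scikit-learn pipeline and fits a linear SVM with `SGDClassifier` (hinge loss). The toy documents, the labels, and the helper name `nltk_tokenize` are made up for illustration, and the choice of `wordpunct_tokenize` plus Porter stemming is just one plausible preprocessing setup, not necessarily the one used later in the post.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, split with NLTK, keep alphabetic tokens, stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy corpus invented for this example (not real data).
docs = [
    "The team won the football match last night",
    "The striker scored two goals in the game",
    "The new laptop ships with a faster processor",
    "This phone has a great camera and battery",
]
labels = ["sports", "sports", "tech", "tech"]

# NLTK handles tokenization; scikit-learn handles features and learning.
# loss="hinge" makes SGDClassifier train a linear SVM via stochastic gradient descent.
model = make_pipeline(
    TfidfVectorizer(tokenizer=nltk_tokenize, lowercase=False, token_pattern=None),
    SGDClassifier(loss="hinge", random_state=0),
)
model.fit(docs, labels)

predictions = model.predict(["The goalkeeper saved the match"])
```

Passing the tokenizer as a callable keeps the NLTK side and the scikit-learn side loosely coupled, so either one can be swapped out (a different stemmer, a different classifier) without touching the other.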