In this post, I show how I use NLTK for preprocessing and tokenization, and then hand the results to scikit-learn for the machine learning itself (e.g. training a linear SVM with stochastic gradient descent).
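To make that division of labor concrete, here is a minimal sketch of how the two libraries can fit together. It is only an illustration under a few assumptions of mine: the `tokenize` and `build_model` helpers, the toy documents, and the `pos`/`neg` labels are all made up. I also use `wordpunct_tokenize` and the Porter stemmer because they need no extra NLTK data downloads. The linear SVM is scikit-learn's `SGDClassifier` with hinge loss.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def tokenize(text):
    """NLTK side: split into tokens, keep alphabetic ones, lowercase, stem."""
    tokens = wordpunct_tokenize(text)
    return [stemmer.stem(tok.lower()) for tok in tokens if tok.isalpha()]


def build_model():
    """scikit-learn side: TF-IDF features feeding a linear SVM trained by SGD."""
    vectorizer = TfidfVectorizer(
        tokenizer=tokenize,   # delegate tokenization to the NLTK helper above
        lowercase=False,      # tokenize() already lowercases
        token_pattern=None,   # unused when a custom tokenizer is supplied
    )
    # Hinge loss makes SGDClassifier a linear SVM fitted by stochastic gradient descent.
    classifier = SGDClassifier(loss="hinge", random_state=0)
    return make_pipeline(vectorizer, classifier)


# Toy corpus, invented purely for illustration.
docs = [
    "I loved this great movie",
    "What a wonderful, great film",
    "This was an awful, boring movie",
    "Terrible plot and boring acting",
]
labels = ["pos", "pos", "neg", "neg"]

model = build_model()
model.fit(docs, labels)
predictions = model.predict(["a great film", "boring and awful"])
```

Keeping the tokenizer as a plain function means the NLTK step can be swapped out (different tokenizer, lemmatizer instead of stemmer) without touching the scikit-learn pipeline.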