In this post, I want to show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques (e.g. building a linear SVM trained with stochastic gradient descent) with scikit-learn.
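A minimal sketch of that workflow could look like the following. It is not taken from the post itself: the toy corpus, the labels, and the helper name `nltk_tokenize` are made up for illustration, and `wordpunct_tokenize` plus `PorterStemmer` are chosen only because they need no extra NLTK data downloads. The linear SVM is approximated, as is common, by scikit-learn's `SGDClassifier` with hinge loss.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, tokenize with NLTK, keep alphabetic tokens, stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy corpus, invented purely for illustration.
train_docs = [
    "The team won the football match last night",
    "The striker scored two goals in the final game",
    "Fans cheered as the players lifted the trophy",
    "The new laptop ships with a faster processor",
    "Developers released a software update for the phone",
    "The chip maker announced a new graphics card",
]
train_labels = ["sports", "sports", "sports", "tech", "tech", "tech"]

# NLTK handles tokenization; scikit-learn handles features and learning.
# loss="hinge" with an L2 penalty gives a linear SVM objective optimized by SGD.
model = make_pipeline(
    TfidfVectorizer(tokenizer=nltk_tokenize, lowercase=False, token_pattern=None),
    SGDClassifier(loss="hinge", penalty="l2", random_state=0),
)
model.fit(train_docs, train_labels)

predictions = model.predict(["The goalkeeper saved the match"])
print(predictions)
```

Passing the NLTK function through the vectorizer's `tokenizer` parameter keeps preprocessing inside the pipeline, so the same steps are applied at both training and prediction time.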
P. Schonhofen. WI '06: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence, pp. 456--462. Washington, DC, USA, IEEE Computer Society, (2006)
X. Phan, L. Nguyen, and S. Horiguchi. WWW '08: Proceeding of the 17th international conference on World Wide Web, pp. 91--100. New York, NY, USA, ACM, (2008)
S. Feldman, M. Marin, M. Ostendorf, and M. Gupta. Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4781--4784. Washington, DC, USA, IEEE Computer Society, (2009)
C. Rose, A. Roque, D. Bhembe, and K. VanLehn. Proceedings of the HLT-NAACL 03 workshop on Building educational applications using natural language processing - Volume 2, pp. 68--75. Stroudsburg, PA, USA, Association for Computational Linguistics, (2003)
W. Cavnar and J. Trenkle. Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, pp. 161--175. Las Vegas, US, (1994)
X. Li, B. Liu, and S. Ng. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp. 218--228. Stroudsburg, PA, USA, Association for Computational Linguistics, (2010)