In this post, I want to show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques (e.g. building a linear SVM trained with stochastic gradient descent) using scikit-learn.
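A minimal sketch of that workflow might look like the following. The tiny corpus, the labels, and the helper name `nltk_tokenize` are made up for illustration and do not come from the post. The sketch assumes NLTK's regex-based `wordpunct_tokenize` and `PorterStemmer` (neither needs a corpus download) and uses `SGDClassifier` with hinge loss, which is scikit-learn's way of fitting a linear SVM by stochastic gradient descent.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, tokenize with NLTK, keep alphabetic tokens, stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy corpus, invented for this example only.
docs = [
    "The team won the football match",
    "A great goal decided the game",
    "The striker scored twice in the match",
    "The new laptop has a fast processor",
    "This phone ships with a better camera",
    "The processor and memory were upgraded",
]
labels = ["sports", "sports", "sports", "tech", "tech", "tech"]

# NLTK handles tokenization; scikit-learn handles features and the model.
# loss="hinge" makes SGDClassifier optimize a linear SVM objective.
model = make_pipeline(
    TfidfVectorizer(tokenizer=nltk_tokenize, lowercase=False, token_pattern=None),
    SGDClassifier(loss="hinge", random_state=0),
)
model.fit(docs, labels)

predictions = model.predict(["The match ended with a late goal",
                             "A faster processor for the laptop"])
print(list(predictions))
```

Passing the NLTK function as the vectorizer's `tokenizer` keeps all preprocessing inside the pipeline, so the same steps run at both training and prediction time.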
In this post you will see five recipes for supervised classification algorithms, each applied to small standard datasets that ship with the scikit-learn library.
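As a rough idea of the shape such a recipe can take, here is a sketch on the bundled iris dataset. The excerpt does not name the five algorithms, so the choice of logistic regression, the 70/30 split, and the `random_state` value are assumptions made purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Load one of the small standard datasets bundled with scikit-learn.
iris = load_iris()

# Hold out 30% of the samples for evaluation (split sizes are an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

# Fit the classifier; any scikit-learn classifier could be swapped in here.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Evaluate on the held-out samples.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.3f}")
print(classification_report(y_test, y_pred, target_names=iris.target_names))
```

Because every scikit-learn classifier exposes the same `fit` and `predict` methods, the recipe stays the same when the estimator is replaced by another one.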