In this post, I want to show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques (e.g., building a linear SVM trained with stochastic gradient descent) using Scikit-Learn.
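A minimal sketch of that workflow might look like the following. The toy corpus, the labels, and the helper name `nltk_tokenize` are assumptions made up for illustration; `wordpunct_tokenize` and `PorterStemmer` are used because they need no extra NLTK data downloads, and `SGDClassifier` with hinge loss stands in for the linear SVM trained by stochastic gradient descent.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, tokenize with NLTK, keep alphabetic tokens, stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy corpus, invented purely for illustration.
docs = [
    "Win a free prize now",
    "Claim your free money today",
    "Free prize waiting, click now",
    "Meeting scheduled for Monday morning",
    "Please review the attached report",
    "Lunch with the project team on Friday",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# NLTK handles tokenization; scikit-learn handles vectorization and learning.
# loss="hinge" makes SGDClassifier optimize a linear SVM objective.
model = make_pipeline(
    TfidfVectorizer(tokenizer=nltk_tokenize, lowercase=False, token_pattern=None),
    SGDClassifier(loss="hinge", random_state=0),
)
model.fit(docs, labels)

predictions = model.predict(["free prize inside", "see you at the meeting"])
print(predictions)
```

Passing the NLTK-based function as the `tokenizer` argument keeps all preprocessing inside the pipeline, so the same steps run at both training and prediction time.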
In the previous post on Support Vector Machines (SVM), we looked at the mathematical details of the algorithm. In this post, I will discuss practical implementations of SVM for both classification and regression. I will use the iris dataset as the example for the classification problem, and randomly generated data as the example for the regression problem.
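As a rough sketch of the two problems described above, assuming scikit-learn's `SVC` and `SVR` estimators: the kernel, `C`, `epsilon`, the train/test split, and the noisy sine curve used as the randomly generated regression data are all illustrative assumptions, not settings taken from the post.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, SVR

# Classification: SVM on the iris dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # mean accuracy on the held-out split
print(f"iris test accuracy: {accuracy:.3f}")

# Regression: SVM on randomly generated data (a noisy sine curve, chosen here as an assumption).
rng = np.random.default_rng(0)
X_reg = np.sort(rng.uniform(0, 5, size=(100, 1)), axis=0)
y_reg = np.sin(X_reg).ravel() + rng.normal(scale=0.1, size=100)
reg = SVR(kernel="rbf", C=1.0, epsilon=0.1)
reg.fit(X_reg, y_reg)
r2 = reg.score(X_reg, y_reg)  # R^2 on the training data
print(f"regression R^2 (training data): {r2:.3f}")
```

The regression score here is computed on the training data only, so it shows that the model fits the curve rather than how well it generalizes.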
F. Benevenuto, G. Magno, T. Rodrigues, and V. Almeida. Proceedings of the Seventh Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference (CEAS), (July 2010)
L. Diosan, M. Oltean, A. Rogozan, and J. Pecuchet. GECCO '07: Proceedings of the 9th annual conference on Genetic and evolutionary computation, 2, p. 1873. London, ACM Press, (7-11 July 2007)