In this post, I want to show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques (e.g., building a linear SVM trained with stochastic gradient descent) using scikit-learn.
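As a rough illustration of that division of labour, here is a minimal sketch in which NLTK handles tokenization and scikit-learn handles feature extraction and training. The toy sentences, labels, and hyperparameters are assumptions made up for this example. `wordpunct_tokenize` is used only because it needs no extra NLTK data download, so a real pipeline would likely use a different tokenizer and more preprocessing.

```python
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Toy corpus (hypothetical data, for illustration only).
texts = [
    "I love this movie, it is great",
    "What a wonderful and great experience",
    "I hate this movie, it is terrible",
    "What an awful and terrible experience",
]
labels = ["pos", "pos", "neg", "neg"]

# NLTK does the tokenization; scikit-learn turns tokens into TF-IDF features.
# token_pattern=None signals that the built-in regex tokenizer is not used.
vectorizer = TfidfVectorizer(tokenizer=wordpunct_tokenize, token_pattern=None)

# With hinge loss, SGDClassifier fits a linear SVM by stochastic gradient descent.
classifier = SGDClassifier(
    loss="hinge", penalty="l2", alpha=1e-4,
    max_iter=1000, tol=1e-3, random_state=0,
)

model = make_pipeline(vectorizer, classifier)
model.fit(texts, labels)

predictions = list(model.predict(["a great movie", "a terrible movie"]))
print(predictions)
```

Wrapping both steps in one pipeline keeps the tokenizer and the classifier together, so the same preprocessing is applied at training and prediction time.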
In the previous post on Support Vector Machines (SVM), we looked at the mathematical details of the algorithm. In this post, I will discuss the practical implementation of SVMs for classification as well as regression. I will use the iris dataset as an example for the classification problem, and randomly generated data as an example for the regression problem.
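To make the two use cases concrete, here is a minimal sketch using scikit-learn's `SVC` and `SVR`. The RBF kernel, the values of `C` and `epsilon`, the train/test split, and the noisy sine curve used as "randomly generated data" are all assumptions chosen for illustration, not necessarily the settings used in the post.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, SVR

# Classification: iris dataset with an RBF-kernel SVM.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # mean accuracy on held-out data
print("classification accuracy:", accuracy)

# Regression: randomly generated data (a noisy sine curve, assumed here).
rng = np.random.default_rng(0)
X_reg = np.sort(rng.uniform(0.0, 5.0, size=(80, 1)), axis=0)
y_reg = np.sin(X_reg).ravel() + rng.normal(scale=0.1, size=80)
reg = SVR(kernel="rbf", C=10.0, epsilon=0.1)
reg.fit(X_reg, y_reg)
r2 = reg.score(X_reg, y_reg)  # R^2 on the training data
print("regression R^2:", r2)
```

Both estimators share the same `fit`, `predict`, and `score` interface, so moving from classification to regression mostly means swapping the estimator and the target.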
S. Kiritchenko, X. Zhu, C. Cherry, and S. Mohammad. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages 437--442. Dublin, Ireland, Association for Computational Linguistics, August 2014.