In this post, I show how I use NLTK for preprocessing and tokenization and then apply machine learning techniques (e.g. training a linear SVM with stochastic gradient descent) using scikit-learn.
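A minimal sketch of that workflow might look like the following. It is not taken from the post itself: the toy texts and labels, the helper name `nltk_tokenize`, and the choice of `wordpunct_tokenize`, `PorterStemmer` and `TfidfVectorizer` are all assumptions made for illustration. The only parts grounded in the description are NLTK for tokenization and a linear SVM trained by stochastic gradient descent, which in scikit-learn corresponds to `SGDClassifier` with hinge loss.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, split with NLTK, keep alphabetic tokens, stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy training data, invented purely for illustration.
train_texts = [
    "The team won the football match",
    "The striker scored two goals in the game",
    "The coach praised the players after the match",
    "The new laptop has a fast processor",
    "The software update fixed several bugs",
    "The programmer compiled the code on the laptop",
]
train_labels = ["sports", "sports", "sports", "tech", "tech", "tech"]

# NLTK handles tokenization; scikit-learn handles vectorization and learning.
# loss="hinge" makes SGDClassifier a linear SVM trained by stochastic gradient descent.
model = make_pipeline(
    TfidfVectorizer(tokenizer=nltk_tokenize, lowercase=False, token_pattern=None),
    SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3, random_state=0),
)
model.fit(train_texts, train_labels)

predictions = model.predict(["The players scored a goal", "The code has bugs"])
print(predictions)
```

Passing the NLTK-based function as the vectorizer's `tokenizer` keeps all text handling in one place, so the same preprocessing is applied at training and prediction time.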
In this post you will see five recipes for supervised classification algorithms applied to small standard datasets that are provided with the scikit-learn library.
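As a rough illustration of what such recipes tend to look like, the sketch below fits five classifiers to the iris dataset bundled with scikit-learn. The description does not say which five algorithms or which datasets the post uses, so the selection here (logistic regression, Gaussian naive Bayes, k-nearest neighbours, a decision tree and a support vector machine) and the train/test split settings are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Iris is one of the small standard datasets shipped with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Assumed selection of five classifiers; the post may use different ones.
classifiers = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "naive Bayes": GaussianNB(),
    "k-nearest neighbours": KNeighborsClassifier(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "support vector machine": SVC(),
}

scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)                  # train on the training split
    scores[name] = clf.score(X_test, y_test)   # mean accuracy on held-out data
    print(f"{name}: {scores[name]:.3f}")
```

Every estimator exposes the same `fit` and `score` methods, which is why the five recipes can share one loop.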
M. Ciaramita, T. Hofmann, and M. Johnson. IJCAI-03, Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, Acapulco, Mexico, August 9-15, 2003, pp. 817-822. Morgan Kaufmann, (2003)
A. Mathes. Class report, Computer Mediated Communication - LIS590CMC, Graduate School of Library and Information Science, University of Illinois Urbana-Champaign, (January 2004)
S. Bloehdorn and A. Hotho. Proceedings of the MSW 2004 workshop at the 10th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 70-87. (August 2004)
S. Bloehdorn and A. Hotho. Proceedings of the Workshop on Text-based Information Retrieval (TIR-04) at the 27th German Conference on Artificial Intelligence, (September 2004)
D. Koller and M. Sahami. Proceedings of the 14th International Conference on Machine Learning (ML), Nashville, Tennessee, July 1997, pp. 170-178. (1997)