In this post, I want to show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques (e.g., building a linear SVM with stochastic gradient descent) using scikit-learn.
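As a rough illustration of that split, here is a minimal sketch in which NLTK handles tokenization and stemming and scikit-learn handles feature extraction and the classifier. The choice of `wordpunct_tokenize` and `PorterStemmer`, the helper names `nltk_tokenize` and `build_model`, and the toy documents and labels are all assumptions made for this example, not details taken from the post. `SGDClassifier` with `loss="hinge"` is scikit-learn's way of fitting a linear SVM with stochastic gradient descent.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Assumption: a regex-based tokenizer and the Porter stemmer, chosen because
# neither needs an extra NLTK data download.
stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Lowercase, tokenize with NLTK, drop non-alphabetic tokens, and stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


def build_model():
    """Build a TF-IDF + linear SVM pipeline trained with SGD."""
    return Pipeline([
        # token_pattern=None because the custom tokenizer replaces it.
        ("tfidf", TfidfVectorizer(tokenizer=nltk_tokenize,
                                  lowercase=False,
                                  token_pattern=None)),
        # Hinge loss makes SGDClassifier a linear SVM.
        ("svm", SGDClassifier(loss="hinge", random_state=0)),
    ])


# Hypothetical toy data, only to show the pipeline running end to end.
train_docs = [
    "The team won the match after scoring two goals",
    "The striker scored in the final minute of the game",
    "The central bank raised interest rates again",
    "Stock markets fell as investors sold shares",
]
train_labels = ["sports", "sports", "finance", "finance"]

model = build_model()
model.fit(train_docs, train_labels)
predictions = model.predict(["the goalkeeper saved the game",
                             "investors worry about interest rates"])
print(list(predictions))
```

Passing the NLTK function as the vectorizer's `tokenizer` keeps all text handling in one place, so the same preprocessing is applied at training and prediction time.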
A. Dargahi Nobari, N. Reshadatmand, and M. Neshati. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 2035–2038. New York, NY, USA, Association for Computing Machinery, (2017)
S. Wang, J. Tang, C. Aggarwal, and H. Liu. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, ACM, (October 2016)
L. Hettinger, A. Zehe, A. Dallmann, and A. Hotho. INFORMATIK 2019: 50 Jahre Gesellschaft für Informatik – Informatik für Gesellschaft, pp. 191–204. Bonn, Gesellschaft für Informatik e.V., (2019)
L. Wang, Z. Cao, G. de Melo, and Z. Liu. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1, pp. 1298–1307. (2016)
R. Girju, P. Nakov, V. Nastase, S. Szpakowicz, P. Turney, and D. Yuret. Proceedings of the 4th International Workshop on Semantic Evaluations, pp. 13–18. Stroudsburg, PA, USA, Association for Computational Linguistics, (2007)
J. Rotsztejn, N. Hollenstein, and C. Zhang. arXiv:1804.02042. Accepted to SemEval 2018 (12th International Workshop on Semantic Evaluation). (2018)
D. Tang, F. Wei, N. Yang, M. Zhou, T. Liu, and B. Qin. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1555–1565. Baltimore, Maryland, Association for Computational Linguistics, (June 2014)
B. Rink and S. Harabagiu. Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 256–259. Association for Computational Linguistics, (2010)