In this post, I show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques (e.g., training a linear SVM with stochastic gradient descent) using Scikit-Learn.
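A minimal sketch of this workflow might look like the following. It is an illustration under stated assumptions, not the post's actual code: the toy corpus, the labels, and the helper name `nltk_tokenize` are invented here, and `wordpunct_tokenize` plus `PorterStemmer` were picked because they need no extra NLTK data downloads. The linear SVM is obtained by training `SGDClassifier` with the hinge loss.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, tokenize with NLTK, drop
    non-alphabetic tokens, and stem what remains."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy corpus and labels, invented purely for illustration.
train_docs = [
    "The team won the football match last night",
    "The striker scored two goals in the final game",
    "The new laptop ships with a faster processor",
    "This software update fixes several security bugs",
]
train_labels = ["sports", "sports", "tech", "tech"]

model = Pipeline([
    # NLTK does the tokenization; scikit-learn only builds TF-IDF vectors.
    ("tfidf", TfidfVectorizer(tokenizer=nltk_tokenize,
                              lowercase=False,
                              token_pattern=None)),
    # Hinge loss + L2 penalty trained by SGD is a linear SVM.
    ("svm", SGDClassifier(loss="hinge", penalty="l2",
                          max_iter=1000, tol=1e-3, random_state=0)),
])

model.fit(train_docs, train_labels)
predictions = model.predict([
    "The goalkeeper saved the match",
    "A processor bug affects the laptop",
])
```

Passing the NLTK function as the vectorizer's `tokenizer` keeps preprocessing and learning in one `Pipeline`, so the same tokenization is applied at training and prediction time.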
C. Henning and R. Ewerth. Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pp. 14--22. New York, NY, USA, ACM, (2017)
S. Bloehdorn and A. Hotho. Proceedings of the Workshop on Text-based Information Retrieval (TIR-04) at the 27th German Conference on Artificial Intelligence, (September 2004)
X. Zhang and Y. LeCun. arXiv:1502.01710, (2015). Comment: This technical report is superseded by a paper entitled "Character-level Convolutional Networks for Text Classification", arXiv:1509.01626, which has considerably more experimental results and a rewritten introduction.
B. Lauser and A. Hotho. Proc. of the 7th European Conference in Research and Advanced Technology for Digital Libraries, ECDL 2003, Volume 2769 of LNCS, pp. 140--151. Springer, (2003)
S. Bloehdorn and A. Hotho. Proceedings of the MSW 2004 workshop at the 10th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 70--87. (August 2004)
S. Bloehdorn and A. Hotho. Proceedings of the Fourth IEEE International Conference on Data Mining, pp. 331--334. IEEE Computer Society Press, (November 2004)
S. Dori-Hacohen and J. Allan. Proceedings of the 22nd ACM international conference on information & knowledge management, pp. 1845--1848. New York, NY, USA, ACM, (2013)
E. Loza Mencía and J. Fürnkranz. Machine Learning and Knowledge Discovery in Databases, Volume 5212 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, (2008)
E. Loza Mencía and J. Fürnkranz. Semantic Processing of Legal Texts, Volume 6036 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, (2010)
X. Li, B. Liu, and S. Ng. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp. 218--228. Stroudsburg, PA, USA, Association for Computational Linguistics, (2010)
W. Cavnar and J. Trenkle. Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, pp. 161--175. Las Vegas, US, (1994)
C. Rose, A. Roque, D. Bhembe, and K. VanLehn. Proceedings of the HLT-NAACL 03 workshop on Building educational applications using natural language processing - Volume 2, pp. 68--75. Stroudsburg, PA, USA, Association for Computational Linguistics, (2003)
S. Feldman, M. Marin, M. Ostendorf, and M. Gupta. Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4781--4784. Washington, DC, USA, IEEE Computer Society, (2009)
X. Phan, L. Nguyen, and S. Horiguchi. WWW '08: Proceeding of the 17th international conference on World Wide Web, pp. 91--100. New York, NY, USA, ACM, (2008)
P. Schonhofen. WI '06: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence, pp. 456--462. Washington, DC, USA, IEEE Computer Society, (2006)
G. Forman, M. Scholz, and S. Rajaram. KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 299--308. New York, NY, USA, ACM, (2009)
M. Li, Y. Cheng, and H. Zhao. CGIV '04: Proceedings of the International Conference on Computer Graphics, Imaging and Visualization, pp. 183--186. Washington, DC, USA, IEEE Computer Society, (2004)
R. Angelova and G. Weikum. SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 485--492. New York, NY, USA, ACM, (2006)
G. Ifrim, M. Theobald, and G. Weikum. Proceedings of the 22nd International Conference on Machine Learning - Learning in Web Search (LWS 2005), pp. 18--26. Bonn, Germany, (2005)
L. Hirsch, R. Hirsch, and M. Saeedi. GECCO '07: Proceedings of the 9th annual conference on Genetic and evolutionary computation, 2, pp. 1604--1611. London, ACM Press, (7-11 July 2007)
Y. Yang and X. Liu. SIGIR '99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pp. 42--49. New York, NY, USA, ACM Press, (1999)
G. Fung, J. Yu, P. Yu, and H. Lu. VLDB '05: Proceedings of the 31st international conference on Very large data bases, pp. 181--192. VLDB Endowment, (2005)
L. Baker and A. McCallum. Proceedings of SIGIR-98, 21st ACM International Conference on Research and Development in Information Retrieval, pp. 96--103. Melbourne, AU, ACM Press, New York, US, (1998)