In this post, I show how I use NLTK for preprocessing and tokenization, and then hand the results to Scikit-Learn for the machine learning step (e.g. training a linear SVM with stochastic gradient descent).
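As a rough illustration of that division of labour, here is a minimal sketch, not the post's actual code. It assumes NLTK's `wordpunct_tokenize` and `PorterStemmer` for preprocessing (both work without downloading extra NLTK data) and Scikit-Learn's `SGDClassifier` with hinge loss, which fits a linear SVM by stochastic gradient descent. The helper name `nltk_tokenize` and the toy sentences and labels are made up for the example.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, split with NLTK, keep alphabetic tokens, stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy, invented training data; a real corpus would be loaded here instead.
train_texts = [
    "I loved this film, the acting was wonderful",
    "A wonderful, moving story with great actors",
    "Terrible plot and awful acting",
    "I hated this boring, awful film",
]
train_labels = ["pos", "pos", "neg", "neg"]

# NLTK handles tokenization; Scikit-Learn handles features and learning.
# loss="hinge" makes SGDClassifier a linear SVM trained with SGD.
model = make_pipeline(
    TfidfVectorizer(tokenizer=nltk_tokenize, lowercase=False, token_pattern=None),
    SGDClassifier(loss="hinge", random_state=0),
)
model.fit(train_texts, train_labels)

predictions = model.predict(["What a wonderful film", "Such a boring plot"])
print(list(predictions))
```

Passing the NLTK function as the vectorizer's `tokenizer` keeps all preprocessing inside the pipeline, so the same steps run at training and prediction time.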
Our flagship collection, under development since 1987, covers the history, literature and culture of the Greco-Roman world. We are applying what we have learned from Classics to other subjects within the humanities and beyond.