Bow (or libbow) is a library of C code useful for writing statistical text analysis, language modeling and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document
In this post, I want to show how I use NLTK for preprocessing and tokenization, but then apply machine learning techniques (e.g. building a linear SVM using stochastic gradient descent) using Scikit-Learn.
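The workflow described in that post can be sketched roughly as follows. This is a minimal illustration and not the post's own code: the toy corpus, the labels, and the helper name `nltk_tokenize` are made up for the example, and the choice of `TreebankWordTokenizer` with `PorterStemmer` is an assumption (both work without downloading extra NLTK data). The linear SVM is approximated by scikit-learn's `SGDClassifier` with hinge loss.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import TreebankWordTokenizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

_tokenizer = TreebankWordTokenizer()
_stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, tokenize with NLTK, drop
    non-alphabetic tokens, and stem what remains."""
    tokens = _tokenizer.tokenize(text.lower())
    return [_stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Made-up toy corpus, for illustration only.
train_docs = [
    "The team won the football match",
    "A great goal decided the game",
    "The striker scored in the final match",
    "The new processor runs programs faster",
    "Programmers compile the code on the server",
    "The software update fixed the bug",
]
train_labels = ["sports", "sports", "sports", "tech", "tech", "tech"]

model = Pipeline([
    # NLTK handles preprocessing; scikit-learn builds the TF-IDF features.
    ("tfidf", TfidfVectorizer(tokenizer=nltk_tokenize,
                              lowercase=False,
                              token_pattern=None)),
    # Hinge loss trained with stochastic gradient descent gives a linear SVM.
    ("svm", SGDClassifier(loss="hinge", random_state=0)),
])
model.fit(train_docs, train_labels)

predictions = model.predict([
    "The goalkeeper saved the match",
    "The compiler reported a bug in the code",
])
```

Passing the NLTK function as the vectorizer's `tokenizer` keeps preprocessing and learning in a single pipeline object, so the same tokenization is applied at training and prediction time.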
In this paper we propose a type of Bayesian network that we call the hierarchical Bayesian network (HBN) classifier. We present algorithms for constructing HBN classifiers and test them on the Reuters text categorization test collection
J. Esparza and F. Reiter. 31st International Conference on Concurrency Theory (CONCUR 2020), volume 171 of Leibniz International Proceedings in Informatics (LIPIcs), pages 10:1-10:16. Dagstuhl, Germany, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, (2020). Preprint: https://arxiv.org/abs/2007.03291
D. Willems and L. Vuurpijl. Proceedings of the Ninth International Conference on Document Analysis and Recognition, pages 869-873. Curitiba, Brazil, (2007)
M. Sahami, S. Dumais, D. Heckerman, and E. Horvitz. Learning for Text Categorization: Papers from the 1998 Workshop, Madison, Wisconsin, AAAI Technical Report WS-98-05, (1998)
R. Neßelrath and J. Alexandersson. Proceedings of the 6th IJCAI Workshop on Knowledge and Reasoning in Practical Dialogue Systems (KRPD-09), held in conjunction with the Twenty-First International Joint Conference on Artificial Intelligence (IJCAI-09), July 12, Pasadena, California, United States, pages 46-51. IJCAI 2009, (July 2009)