Bow (or libbow) is a library of C code useful for writing statistical text analysis, language modeling and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document
In this post, I want to show how I use NLTK for preprocessing and tokenization, but then apply machine learning techniques (e.g. building a linear SVM using stochastic gradient descent) using Scikit-Learn.
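As a rough illustration of the workflow this entry describes, the sketch below tokenizes and stems with NLTK and then trains a linear SVM by stochastic gradient descent in Scikit-Learn (`SGDClassifier` with hinge loss). It is not taken from the post: the toy corpus, the labels, the helper name `nltk_tokenize`, and the choice of `wordpunct_tokenize`, `PorterStemmer`, and TF-IDF features are all assumptions made for the example.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, split with NLTK, keep alphabetic tokens, stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy corpus invented purely for illustration.
train_docs = [
    "The team won the football match",
    "The striker scored two goals",
    "The goalkeeper saved a penalty",
    "The parliament passed a new law",
    "The minister announced the election date",
    "Voters elected a new president",
]
train_labels = ["sport", "sport", "sport", "politics", "politics", "politics"]

# NLTK handles tokenization; Scikit-Learn handles features and learning.
# loss="hinge" makes SGDClassifier fit a linear SVM with stochastic gradient descent.
model = make_pipeline(
    TfidfVectorizer(tokenizer=nltk_tokenize, lowercase=False, token_pattern=None),
    SGDClassifier(loss="hinge", random_state=0),
)
model.fit(train_docs, train_labels)

prediction = model.predict(["The striker scored in the final match"])
print(prediction)
```

Passing the NLTK function through the vectorizer's `tokenizer` argument keeps preprocessing and learning in one pipeline, so the same tokenization is applied at training and prediction time.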
In this paper we propose a type of Bayesian network that we call hierarchical Bayesian network (HBN) classifiers. We present algorithms for constructing HBN classifiers and test them on the Reuters text categorization test collection
M. Croitoru, B. Hu, S. Dasmahapatra, P. Lewis, D. Dupplaw, and L. Xiao. Proceedings of the 15th International Conference on Conceptual Structures (ICCS 2007), volume 4604 of Lecture Notes in Artificial Intelligence, page 140--153. Berlin, Heidelberg, Springer-Verlag, (July 2007)
J. Eggermont, A. Eiben, and J. van Hemert. Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99), page 253--254. Kasteel Vaeshartelt, Maastricht, Holland, (3-4 November 1999)
Y. Yang and J. Pedersen. Proceedings of ICML-97, 14th International Conference on Machine Learning, page 412--420. Nashville, US, Morgan Kaufmann Publishers, San Francisco, US, (1997)