Libtextcat is a library with functions that implement the classification technique described in Cavnar & Trenkle, "N-Gram-Based Text Categorization" [1]. It was primarily developed for language guessing, a task on which it is known to perform with near-pe
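As a rough illustration of the Cavnar & Trenkle technique named above, here is a minimal Python sketch. It is not libtextcat's actual C API: the function names (`ngram_profile`, `out_of_place`, `classify`), the underscore padding, and the defaults (`max_n=5`, `top_k=300`) are assumptions chosen for this example. Each text is reduced to a ranked list of its most frequent character n-grams, and a document is assigned to the category whose profile has the smallest "out-of-place" rank distance.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top_k=300):
    """Return the top_k most frequent character n-grams (n = 1..max_n),
    most frequent first. Tokens are padded with '_' to mark word boundaries
    (a simplification chosen for this sketch)."""
    counts = Counter()
    for token in text.lower().split():
        padded = f"_{token}_"
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count; break ties alphabetically so results are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, cat_profile):
    """Sum of rank differences between two profiles. N-grams missing from
    the category profile receive a fixed maximum penalty."""
    rank = {gram: i for i, gram in enumerate(cat_profile)}
    max_penalty = len(cat_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def classify(text, category_profiles):
    """Return the name of the category whose profile is closest to the text."""
    doc = ngram_profile(text)
    return min(
        category_profiles,
        key=lambda name: out_of_place(doc, category_profiles[name]),
    )
```

To use it, build one profile per category from training text, for example `profiles = {"english": ngram_profile(english_text), "german": ngram_profile(german_text)}` with hypothetical training strings, and then call `classify(unknown_text, profiles)`. Real deployments train each profile on far more text than a toy example would contain.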
Bow (or libbow) is a library of C code useful for writing statistical text analysis, language modeling and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document
"The future co-existence of controlled vocabularies and collaborative tagging is predicted, with each appropriate for use within distinct information contexts: formal and informal."
W. Martins, M. Goncalves, A. Laender, and G. Pappa. Proceedings of the 9th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 193--202. New York, NY, USA, ACM, (2009)