FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.
Pattern is a web mining module for the Python programming language.
It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface, syntactical + semantical n-gram search algorithm, tf-idf + cosine similarity + LSA metrics), clustering and classification (k-means, KNN, SVM), and data visualization (graph networks).
G. Forman, and E. Kirshenbaum. CIKM '08: Proceeding of the 17th ACM conference on Information and knowledge management, page 1221--1230. New York, NY, USA, ACM, (2008)
D. Lin. Proceedings of the 17th international conference on Computational linguistics, page 768--774. Morristown, NJ, USA, Association for Computational Linguistics, (1998)
M. Banko, and E. Brill. ACL '01: Proceedings of the 39th Annual Meeting on Association for Computational Linguistics, page 26--33. Morristown, NJ, USA, Association for Computational Linguistics, (2001)
S. Baluja, D. Ravichandran, and D. Sivakumar. Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (KDIR 2009), INSTICC, (6-8 Oct 2009)
P. Teufl, and G. Lackner. 10th International Conference on Knowledge Management and Knowledge Technologies, 1–3 September 2010, Messe Congress Graz, Austria, page 18. (2010)