Pattern is a web mining module for the Python programming language.
It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface, syntactical + semantical n-gram search algorithm, tf-idf + cosine similarity + LSA metrics), clustering and classification (k-means, KNN, SVM), and data visualization (graph networks).
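To illustrate two of the text-analysis metrics named above, here is a minimal sketch of tf-idf weighting followed by cosine similarity. It is written in plain Python with only the standard library and does not use Pattern's own API; the function names `tf_idf` and `cosine_similarity`, the whitespace tokenization, and the sample documents are assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Hypothetical helper: tf is the relative term frequency in a document,
    idf is log(N / df) with N documents and df documents containing the term.
    """
    n = len(documents)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An all-zero vector is treated as similar to nothing.
    return dot / (norm_a * norm_b)


# Sample documents (made up for illustration), tokenized on whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the cat chased the mouse".split(),
    "stock markets fell sharply".split(),
]
vectors = tf_idf(docs)
sim_cats = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
```

The first two documents share terms and so score above zero, while the third shares no vocabulary with the first and scores exactly zero. A term that occurs in every document gets an idf of zero under this weighting, which is one common convention among several.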
FOX is a framework that integrates the Linked Data Cloud and makes use of the diversity of NLP algorithms to extract high-accuracy RDF triples from natural language. In its current version, it integrates and merges the results of Named Entity Recognition, Keyword Extraction and Relation Extraction tools.
D. Lin. Proceedings of the 17th International Conference on Computational Linguistics, pages 768--774. Morristown, NJ, USA, Association for Computational Linguistics, (1998)
S. Brants and S. Hansen. In Proceedings of the Third Conference on Language Resources and Evaluation LREC-02. Las Palmas de Gran Canaria, pages 1643--1649. (2002)