Elefant (Efficient Learning, Large-scale Inference, and Optimisation Toolkit) is an open source machine learning library licensed under the Mozilla Public License (MPL). The toolkit provides
algorithms for machine learning that exploit multi-core and multi-threaded processors on Linux, Windows, and Mac OS X,
a graphical user interface for users who want to quickly prototype machine learning experiments,
tutorials to support learning about Statistical Machine Learning (Statistical Machine Learning at The Australian National University), and
detailed and precise documentation for each of the above.
Markov Logic Networks (MLNs) are a powerful framework that combines statistical and logical reasoning; they have been applied to many data-intensive problems, including information extraction, entity resolution, text mining, and natural language processing. Built on principled data management techniques, Tuffy is an MLN inference engine that achieves scalability and orders-of-magnitude speedups over prior implementations. It is written in Java and relies on PostgreSQL. For a brief introduction to MLNs and the technical details of Tuffy, please see our technical report.
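For orientation, the way an MLN combines logic and statistics can be written in one formula. This is the standard textbook definition rather than anything specific to Tuffy; the symbols are assumptions introduced here: w_i is the weight attached to the i-th first-order formula, n_i(x) is the number of true groundings of that formula in the possible world x, and Z is the normalising constant.

```latex
P(X = x) = \frac{1}{Z} \exp\Big( \sum_{i} w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\Big( \sum_{i} w_i \, n_i(x') \Big)
```

A world that violates a formula is therefore less probable, not impossible, and the larger the weight, the stronger the penalty.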
SUBDUE is a graph-based knowledge discovery system that finds structural, relational patterns in data representing entities and relationships. SUBDUE represents data using a labeled, directed graph in which entities are represented by labeled vertices or subgraphs, and relationships are represented by labeled edges between the entities. SUBDUE uses the minimum description length (MDL) principle to identify patterns that minimize the number of bits needed to describe the input graph after being compressed by the pattern. SUBDUE can perform several learning tasks, including unsupervised learning, supervised learning, clustering and graph grammar learning.
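The compression idea can be illustrated with a rough sketch. It is a simplification and not SUBDUE's actual bit-level MDL encoding: it counts vertices and edges instead of bits, assumes the pattern's instances do not overlap, and assumes each instance is replaced by a single vertex. All function and parameter names are hypothetical.

```python
def graph_size(num_vertices, num_edges):
    """Crude stand-in for description length: vertices plus edges."""
    return num_vertices + num_edges


def compression_value(graph_v, graph_e, pattern_v, pattern_e, instances):
    """Ratio of the original size to the size after compression.

    Each non-overlapping instance of the pattern is collapsed into a
    single vertex, so it removes (pattern_v - 1) vertices and pattern_e
    edges from the graph. Values above 1 mean the pattern compresses
    the graph; higher is better.
    """
    compressed_v = graph_v - instances * (pattern_v - 1)
    compressed_e = graph_e - instances * pattern_e
    if compressed_v < 0 or compressed_e < 0:
        raise ValueError("more instances than the graph can contain")
    original = graph_size(graph_v, graph_e)
    # Cost of storing the pattern once plus the compressed graph.
    compressed = graph_size(pattern_v, pattern_e) + graph_size(compressed_v, compressed_e)
    return original / compressed


# A made-up graph with 12 vertices and 14 edges.
triangle = compression_value(12, 14, pattern_v=3, pattern_e=3, instances=3)
single_edge = compression_value(12, 14, pattern_v=2, pattern_e=1, instances=4)
print(f"triangle: {triangle:.3f}, single edge: {single_edge:.3f}")
```

Under these assumptions the triangle pattern scores higher than the single-edge pattern, so a search guided by this measure would prefer it.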
RDF data can be analyzed with various query languages such as SPARQL or SeRQL. Due to their nature these query languages do not support fuzzy queries. In this paper we present a new method that transforms the information presented by subject-relation-object relations within RDF data into Activation Patterns. These patterns represent a common model that is the basis for a number of sophisticated analysis methods such as semantic relation analysis, semantic search queries, unsupervised clustering, supervised learning or anomaly detection. In this paper, we explain the Activation Patterns concept and apply it to an RDF representation of the well-known CIA World Factbook.
PyBrain is a modular machine learning library for Python. Its goal is to offer flexible, easy-to-use, yet still powerful algorithms for machine learning tasks, along with a variety of predefined environments in which to test and compare your algorithms.
PyBrain is short for Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library.
Pattern is a web mining module for the Python programming language.
It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface, syntactical + semantical n-gram search algorithm, tf-idf + cosine similarity + LSA metrics), clustering and classification (k-means, KNN, SVM), and data visualization (graph networks).
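To make the text-analysis metrics concrete, here is a minimal sketch of tf-idf weighting followed by cosine similarity. It is written in plain Python and does not use or reproduce Pattern's own API; the function names, the whitespace tokenisation, and the sample sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Return one {term: weight} dict per tokenised document."""
    n_docs = len(documents)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            # Relative term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = [
    "graph mining finds patterns in graphs".split(),
    "graph mining with markov logic".split(),
    "web mining for python".split(),
]
vecs = tf_idf_vectors(docs)
sim_01 = cosine_similarity(vecs[0], vecs[1])
sim_02 = cosine_similarity(vecs[0], vecs[2])
print(sim_01 > sim_02)
```

Because "mining" occurs in every sample document, its idf weight is zero, so the first and third documents end up with no similarity even though they share that word.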
The Ninth Workshop on Mining and Learning with Graphs will be held in conjunction with the 17th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, which takes place August 21-24, 2011 in San Diego, CA.
Mloss is a community effort at producing reproducible research via open source software, open access to data and results, and open standards for interchange.
E. Keogh, S. Lonardi, and C. Ratanamahatana. KDD '04: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 206-215. New York, NY, USA, ACM, (2004)
S. Baluja, D. Ravichandran, and D. Sivakumar. Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (KDIR 2009), INSTICC, (6-8 October 2009)
S. Bloehdorn and A. Hotho. Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 1-4 November 2004, Brighton, UK, pages 331-334. IEEE Computer Society, (November 2004)
M. Banko and E. Brill. ACL '01: Proceedings of the 39th Annual Meeting on Association for Computational Linguistics, pages 26-33. Morristown, NJ, USA, Association for Computational Linguistics, (2001)
S. Schoenmackers, O. Etzioni, and D. Weld. EMNLP '08: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 79-88. Morristown, NJ, USA, Association for Computational Linguistics, (2008)