T-Rex (Trainable Relation Extraction) is a highly configurable, machine learning-based framework for information extraction from text. It includes tools for document classification, entity extraction, and relation extraction.
With proper mark-up/logic separation, a POJO data model, and a refreshing lack of XML, Apache Wicket makes developing web-apps simple and enjoyable again.
TimeML (Markup Language for Temporal and Event Expressions) is a robust specification language for events and temporal expressions in natural language.
Protégé is a free, open source ontology editor and knowledge-base framework.
The Protégé platform supports two main ways of modeling ontologies via the Protégé-Frames and Protégé-OWL editors. Protégé ontologies can be exported into a variety of formats including RDF(S), OWL, and XML Schema.
Protégé is based on Java, is extensible, and provides a plug-and-play environment that makes it a flexible base for rapid prototyping and application development.
The OntoLT approach aims at a more direct connection between ontology engineering and linguistic analysis. OntoLT is a Protégé plug-in with which concepts (Protégé classes) and relations (Protégé slots) can be extracted automatically from linguistically annotated text collections. It provides mapping rules, defined in a precondition language, that map linguistic entities in text to class/slot candidates in Protégé.
This workshop will gather researchers in a variety of fields that contribute to the automated construction of knowledge bases. It will be held at Xerox Research Centre Europe, near Grenoble (France), May 17-19, 2010.
andLinux runs Linux natively inside Windows. It is a complete Ubuntu Linux system running seamlessly on Windows 2000-based systems (2000, XP, 2003, Vista, 7; 32-bit versions only).
qooxdoo is a comprehensive and innovative framework for creating rich internet applications (RIAs). Leveraging object-oriented JavaScript allows developers to build impressive cross-browser applications. No HTML, CSS, or DOM knowledge is needed.
Our goal is to develop a probabilistic knowledge base that mirrors the content of the web. We are developing a system that uses semi-supervised learning methods to learn to extract symbolic knowledge from unstructured text and HTML. We are exploring methods of continuous learning, where our system runs 24x7, continuously learning to read better and continuously extracting facts from the web.
ConceptNet represents data in the form of a semantic network, and makes it available to be used in natural language processing and intelligent user interfaces.
F. Reichartz, H. Korte, and G. Paass. KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, page 773--782. New York, NY, USA, ACM, (2010)
F. Suchanek, G. Ifrim, and G. Weikum. 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006), page 712--717. New York, NY, USA, ACM, (2006)
P. Pantel, D. Ravichandran, and E. Hovy. Proceedings of the 20th international conference on Computational Linguistics (COLING-04), page 771--777. Geneva, Switzerland, Association for Computational Linguistics, (2004)
A. Carlson, J. Betteridge, R. Wang, E. Hruschka Jr., and T. Mitchell. WSDM '10: Proceedings of the third ACM international conference on Web search and data mining, page 101--110. New York, NY, USA, ACM, (2010)
D. Downey, M. Broadhead, and O. Etzioni. Proc. of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI'07), Hyderabad, India, (January 2007)