Article

cycloped.io : an interoperable framework for web knowledge bases

SWJ - Semantic Web Journal (January 2016). bibtex: smywinskicycloped.

Abstract

The paper introduces cycloped.io, an open-source framework that aims to provide structure for Wikipedia, based on a taxonomy extracted from the Cyc ontology. It discusses the distinctive features of the taxonomy, the methods the framework offers for mapping and classifying external knowledge bases, several methods employed to classify the articles of the English Wikipedia into Cyc, and the results of the classification. The important features of the taxonomy are: its well-defined and stable structure, a large number of entity types and predicates useful for entity classification and description, a number of functions allowing for efficient and automatic processing of its contents, the modularity provided by Cyc’s microtheories, and higher-order entity types. The methods used to classify the Wikipedia entities are based on the Wikipedia category system, the first sentences of articles, the direct mapping of articles to Cyc concepts, and the mapping of DBpedia classes to Cyc types. A new classification method based on category name patterns is also briefly described. The paper discusses the performance of the individual methods and of their combination. Manual validation of the classification shows that it is possible to classify 96% of the English Wikipedia articles with precision above 90%. The paper concludes with an example demonstrating the benefits of a well-defined and stable ontology as a basis for entity classification.
