Inproceedings,

ADAGIO - Automated Data Augmentation of Knowledge Graphs Using Multi-expression Learning

Proceedings of the 33rd ACM Conference on Hypertext and Hypermedia, (2022)
DOI: 10.1145/3511095.3531287

Abstract

The creation of an RDF knowledge graph for a particular application commonly involves a pipeline of tools that transform a set of input data sources into an RDF knowledge graph in a process called dataset augmentation. The components of such augmentation pipelines often require extensive configuration to lead to satisfactory results. Thus, non-experts are often unable to use them. We present an efficient supervised algorithm based on genetic programming for learning knowledge graph augmentation pipelines of arbitrary length. Our approach uses multi-expression learning to learn augmentation pipelines able to achieve a high F-measure on the training data. Our evaluation suggests that our approach can efficiently learn a larger class of RDF dataset augmentation tasks than the state of the art while using only a single training example. Even on the most complex augmentation problem we posed, our approach consistently achieves an average F1-measure of 99% in under 500 iterations with an average runtime of 16 seconds.
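The abstract names the F-measure as the quality criterion for a learned pipeline but does not say how it is computed. As a rough illustration only, the sketch below scores a pipeline's output by comparing the set of triples it produced against the expected set from a training example. The function name `f1_score`, the tuple representation of triples, and the sample data are assumptions made for this example and are not taken from the paper.

```python
from typing import Set, Tuple

# Assumption: an RDF triple is modelled as a plain (subject, predicate, object) tuple.
Triple = Tuple[str, str, str]


def f1_score(produced: Set[Triple], expected: Set[Triple]) -> float:
    """Harmonic mean of precision and recall over two sets of triples."""
    # An empty side makes precision or recall undefined; treat it as a score of 0.
    if not produced or not expected:
        return 0.0
    true_positives = len(produced & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(produced)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


# Hypothetical training example: the target graph has three triples.
expected = {
    ("ex:Leipzig", "rdf:type", "ex:City"),
    ("ex:Leipzig", "ex:country", "ex:Germany"),
    ("ex:Leipzig", "rdfs:label", "Leipzig"),
}

# Hypothetical pipeline output: two correct triples and one wrong one.
produced = {
    ("ex:Leipzig", "rdf:type", "ex:City"),
    ("ex:Leipzig", "ex:country", "ex:Germany"),
    ("ex:Leipzig", "rdfs:label", "Lipsia"),
}

score = f1_score(produced, expected)
```

In a genetic programming setting, a score of this kind could serve as the fitness of a candidate pipeline; whether the authors compute it exactly this way is not stated in the abstract.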

Users

  • @dice-research