Domain-specific modeling: a Food and Drink Gazetteer

Transactions on Computational Collective Intelligence XXVI, special issue on Keyword Search in Big Data, volume 10190 of LNCS, pages 186-209. Springer (July 2017)
DOI: 10.1007/978-3-319-59268-8_9

Abstract

Our goal is to build a Food and Drink (FD) gazetteer that can support classification of general FD-related concepts, efficient faceted search, or automated semantic enrichment. Designing domain-specific models from scratch under full supervision does not scale. Integrating several existing knowledge bases is tedious and does not guarantee coverage. Completely data-driven approaches require large amounts of training data, which are not always available. For general domains such as FD, re-using encyclopedic knowledge bases like Wikipedia may be a good idea. We propose a semi-supervised approach that uses a restricted Wikipedia as the base for the model: a domain-relevant Wikipedia category is selected as the root, all its subcategories are included, and irrelevant categories are removed through a combination of expert and data-driven pruning.
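
The category selection described in the abstract can be illustrated with a minimal sketch. It assumes the category graph is already available as an in-memory mapping from a category to its direct subcategories; the toy graph, the function name `collect_categories`, and the `EXPERT_PRUNED` set are hypothetical and are not taken from the paper. Only the expert-pruning step is shown, since the abstract does not specify how the data-driven pruning works.

```python
from collections import deque


def collect_categories(subcats, root, pruned=frozenset()):
    """Breadth-first walk from `root`, skipping pruned categories.

    A pruned category is never visited, so its subcategories are only
    kept if they are also reachable through some non-pruned path.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        for child in subcats.get(cat, ()):
            # Skip expert-pruned categories and anything already visited
            # (the `seen` check also protects against cycles).
            if child in pruned or child in seen:
                continue
            seen.add(child)
            queue.append(child)
    return seen


# Toy category graph (hypothetical, not real Wikipedia data).
TOY_SUBCATS = {
    "Food and drink": ["Beverages", "Cuisine", "Food and drink companies"],
    "Beverages": ["Tea", "Coffee"],
    "Cuisine": ["Desserts"],
    "Food and drink companies": ["Restaurant chains"],
    "Tea": ["Beverages"],  # a cycle, to show the walk still terminates
}

# Categories an expert marked as irrelevant (assumed for illustration).
EXPERT_PRUNED = {"Food and drink companies"}

categories = collect_categories(TOY_SUBCATS, "Food and drink", EXPERT_PRUNED)
print(sorted(categories))
```

In this sketch, pruning a category also drops everything reachable only through it, which is one plausible reading of "pruning of irrelevant categories"; a different policy (for example, re-attaching orphaned subcategories) would need a different traversal.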