Domain-specific modeling: a Food and Drink Gazetteer
A. Tagarev, L. Tolosi, and V. Alexiev. Transactions on Computational Collective Intelligence XXVI, special issue on Keyword Search in Big Data, volume 10190 of LNCS, pages 186-209. Springer (July 2017)
DOI: 10.1007/978-3-319-59268-8_9
Abstract
Our goal is to build a Food and Drink (FD) gazetteer that can serve for classification of general, FD-related concepts, efficient faceted search or automated semantic enrichment. Fully supervised design of domain-specific models ex novo is not scalable. Integration of several ready knowledge bases is tedious and does not ensure coverage. Completely data-driven approaches require a large amount of training data, which is not always available. For general domains (such as the FD domain), re-using encyclopedic knowledge bases like Wikipedia may be a good idea. We propose here a semi-supervised approach that uses a restricted Wikipedia as a base for the modeling, achieved by selecting a domain-relevant Wikipedia category as root for the model and all its subcategories, combined with expert and data-driven pruning of irrelevant categories.
%0 Conference Paper
%1 TagarevTolosiAlexiev2017-FD-extended
%A Tagarev, Andrey
%A Tolosi, Laura
%A Alexiev, Vladimir
%B Transactions on Computational Collective Intelligence XXVI, special issue on Keyword Search in Big Data
%D 2017
%E Nguyen, Ngoc Thanh
%E Kowalczyk, Ryszard
%E Pinto, Alexandre Miguel
%E Cardoso, Jorge
%I Springer
%K Cultural_Heritage DBpedia Europeana Wikipedia categorization classification concept_extraction food_and_drink gazetteer semantic_enrichment
%P 186-209
%R 10.1007/978-3-319-59268-8_9
%T Domain-specific modeling: a Food and Drink Gazetteer
%V 10190
%X Our goal is to build a Food and Drink (FD) gazetteer that can serve for classification of general, FD-related concepts, efficient faceted search or automated semantic enrichment. Fully supervised design of domain-specific models ex novo is not scalable. Integration of several ready knowledge bases is tedious and does not ensure coverage. Completely data-driven approaches require a large amount of training data, which is not always available. For general domains (such as the FD domain), re-using encyclopedic knowledge bases like Wikipedia may be a good idea. We propose here a semi-supervised approach that uses a restricted Wikipedia as a base for the modeling, achieved by selecting a domain-relevant Wikipedia category as root for the model and all its subcategories, combined with expert and data-driven pruning of irrelevant categories.
@inproceedings{TagarevTolosiAlexiev2017-FD-extended,
abstract = {Our goal is to build a Food and Drink (FD) gazetteer that can serve for classification of general, FD-related concepts, efficient faceted search or automated semantic enrichment. Fully supervised design of domain-specific models ex novo is not scalable. Integration of several ready knowledge bases is tedious and does not ensure coverage. Completely data-driven approaches require a large amount of training data, which is not always available. For general domains (such as the FD domain), re-using encyclopedic knowledge bases like Wikipedia may be a good idea. We propose here a semi-supervised approach that uses a restricted Wikipedia as a base for the modeling, achieved by selecting a domain-relevant Wikipedia category as root for the model and all its subcategories, combined with expert and data-driven pruning of irrelevant categories.},
added-at = {2021-08-25T16:07:36.000+0200},
author = {Tagarev, Andrey and Tolosi, Laura and Alexiev, Vladimir},
biburl = {https://www.bibsonomy.org/bibtex/22c6f11ab9051c00cef8a01692c8d5004/valexiev},
booktitle = {Transactions on Computational Collective Intelligence XXVI, special issue on Keyword Search in Big Data},
doi = {10.1007/978-3-319-59268-8_9},
editor = {Nguyen, Ngoc Thanh and Kowalczyk, Ryszard and Pinto, Alexandre Miguel and Cardoso, Jorge},
interhash = {f8dc8fcc5042a9240ed54fbfd20deb91},
intrahash = {2c6f11ab9051c00cef8a01692c8d5004},
keywords = {Cultural_Heritage DBpedia Europeana Wikipedia categorization classification concept_extraction food_and_drink gazetteer semantic_enrichment},
month = jul,
pages = {186--209},
publisher = {Springer},
series = {LNCS},
timestamp = {2021-08-25T16:07:36.000+0200},
title = {{Domain-specific modeling: a Food and Drink Gazetteer}},
url_preprint = {http://rawgit2.com/VladimirAlexiev/my/master/pubs/Tagarev2017-DomainSpecificGazetteer.pdf},
url_published = {https://link.springer.com/chapter/10.1007/978-3-319-59268-8_9},
volume = 10190,
year = 2017
}