Automated News Suggestions for Populating Wikipedia Entity Pages
B. Fetahu, K. Markert, and A. Anand. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management, Melbourne, Australia (2015)
Abstract
Wikipedia entity pages are a valuable source of information for direct consumption and for knowledge-base construction, update and maintenance. Facts in these entity pages are typically supported by references. Recent studies show that as much as 20% of the references are from online news sources. However, many entity pages are incomplete even if relevant information is already available in existing news articles. Even for the already present references, there is often a delay between the news article publication time and the reference time. In this work, we therefore look at Wikipedia through the lens of news and propose a novel news-article suggestion task to improve news coverage in Wikipedia, and reduce the lag of newsworthy references. Our work finds direct application, as a precursor, to Wikipedia page generation and knowledge-base acceleration tasks that rely on relevant and high-quality input sources.
We propose a two-stage supervised approach for suggesting news articles to entity pages for a given state of Wikipedia. First, we suggest news articles to Wikipedia entities (article-entity placement) relying on a rich set of features which take into account the salience and relative authority of entities, and the novelty of news articles to entity pages. Second, we determine the exact section in the entity page for the input article (article-section placement) guided by class-based section templates. We perform an extensive evaluation of our approach based on ground-truth data that is extracted from external references in Wikipedia. We achieve a high precision value of up to 93% in the article-entity suggestion stage and up to 84% for the article-section placement. Finally, we compare our approach against competitive baselines and show significant improvements.
%0 Conference Paper
%1 fetahu2015automated
%A Fetahu, Besnik
%A Markert, Katja
%A Anand, Avishek
%B Proceedings of the 24th ACM International Conference on Information and Knowledge Management
%C Melbourne, Australia
%D 2015
%K myown wikipedia
%T Automated News Suggestions for Populating Wikipedia Entity Pages
%U http://dblp.uni-trier.de/db/conf/cikm/cikm2015.html#FetahuMA15
%X Wikipedia entity pages are a valuable source of information for direct
consumption and for knowledge-base construction, update and
maintenance. Facts in these entity pages are typically supported by
references. Recent studies show that as much as 20% of the references are from online news sources. However, many entity pages
are incomplete even if relevant information is already available in
existing news articles. Even for the already present references,
there is often a delay between the news article publication time and
the reference time. In this work, we therefore look at Wikipedia
through the lens of news and propose a novel news-article suggestion task to improve news coverage in Wikipedia, and reduce the
lag of newsworthy references. Our work finds direct application, as
a precursor, to Wikipedia page generation and knowledge-base acceleration
tasks that rely on relevant and high-quality input sources.
We propose a two-stage supervised approach for suggesting news
articles to entity pages for a given state of Wikipedia. First, we suggest news articles to Wikipedia entities (article-entity placement)
relying on a rich set of features which take into account the
salience and relative authority of entities, and the
novelty of news articles to entity pages. Second, we determine the exact section in the entity page for the input article (article-section placement) guided
by class-based section templates. We perform an extensive evaluation of our approach based on ground-truth data that is extracted
from external references in Wikipedia. We achieve a high precision
value of up to 93% in the article-entity
suggestion stage and up to 84% for the
article-section placement. Finally, we compare
our approach against competitive baselines and show significant improvements.
@inproceedings{fetahu2015automated,
abstract = {Wikipedia entity pages are a valuable source of information for direct
consumption and for knowledge-base construction, update and
maintenance. Facts in these entity pages are typically supported by
references. Recent studies show that as much as 20\% of the references are from online news sources. However, many entity pages
are incomplete even if relevant information is already available in
existing news articles. Even for the already present references,
there is often a delay between the news article publication time and
the reference time. In this work, we therefore look at Wikipedia
through the lens of news and propose a novel news-article suggestion task to improve news coverage in Wikipedia, and reduce the
lag of newsworthy references. Our work finds direct application, as
a precursor, to Wikipedia page generation and knowledge-base acceleration
tasks that rely on relevant and high-quality input sources.
We propose a two-stage supervised approach for suggesting news
articles to entity pages for a given state of Wikipedia. First, we suggest news articles to Wikipedia entities (article-entity placement)
relying on a rich set of features which take into account the
salience and relative authority of entities, and the
novelty of news articles to entity pages. Second, we determine the exact section in the entity page for the input article (article-section placement) guided
by class-based section templates. We perform an extensive evaluation of our approach based on ground-truth data that is extracted
from external references in Wikipedia. We achieve a high precision
value of up to 93\% in the article-entity
suggestion stage and up to 84\% for the
article-section placement. Finally, we compare
our approach against competitive baselines and show significant improvements.},
added-at = {2016-01-04T10:52:49.000+0100},
address = {Melbourne, Australia},
author = {Fetahu, Besnik and Markert, Katja and Anand, Avishek},
biburl = {https://www.bibsonomy.org/bibtex/2238faecd1c8f649f3fc294cf08e5444a/markert},
booktitle = {Proceedings of the 24th ACM International Conference on Information and Knowledge Management},
interhash = {ec5789247fa4bfa052d0a4204a80b352},
intrahash = {238faecd1c8f649f3fc294cf08e5444a},
keywords = {myown wikipedia},
timestamp = {2016-01-04T10:52:49.000+0100},
title = {Automated News Suggestions for Populating Wikipedia Entity Pages},
url = {http://dblp.uni-trier.de/db/conf/cikm/cikm2015.html#FetahuMA15},
year = 2015
}