Geotagging Named Entities in News and Online Documents

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pages 1321-1330. New York, NY, USA, ACM (2016)
DOI: 10.1145/2983323.2983795

Abstract

News sources generate constant streams of text with many references to real-world entities; understanding the content from such sources often requires effectively detecting the geographic foci of those entities. We study the problem of associating geography with named entities in online documents. More specifically, given a named entity and a page (or a set of pages) where the entity is mentioned, we study how the geographic focus of the name can be resolved at a given location granularity (e.g., city or country), assuming that the name has a geographic focus. We further study dispersion and show that the dispersion of a name can be estimated with good accuracy, allowing a geo-centre to be detected at an exact dispersion level. Two key features of our approach are: (i) minimal assumptions are made about the structure of the mentions, hence the approach can be applied to a diverse and heterogeneous set of web pages, and (ii) the approach is unsupervised, leveraging shallow English linguistic features and the large volume of location data in the public domain. We evaluate our methods under different task settings and with different categories of named entities. Our evaluation reveals that the geo-centre of a name can be estimated with good accuracy based on some simple statistics of the mentions, and that the accuracy of the estimation varies with the categories of the names.
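To make the problem statement concrete, here is a minimal sketch of the task as the abstract frames it. It is not the paper's method. The toy gazetteer, the function names, the most-frequent-location rule for the geo-centre, and the use of normalized entropy as a dispersion measure are all assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy gazetteer: lowercase place name -> (city, country).
# A real system would draw on a large public location database.
GAZETTEER = {
    "edmonton": ("Edmonton", "Canada"),
    "calgary": ("Calgary", "Canada"),
    "toronto": ("Toronto", "Canada"),
    "seattle": ("Seattle", "USA"),
}


def location_counts(pages, level):
    """Count gazetteer matches in the pages at the given granularity."""
    index = 0 if level == "city" else 1
    counts = Counter()
    for page in pages:
        for token in page.lower().split():
            token = token.strip(".,;:!?()\"'")
            if token in GAZETTEER:
                counts[GAZETTEER[token][index]] += 1
    return counts


def geo_centre(counts):
    """Return the most frequently mentioned location, or None if there is none."""
    return counts.most_common(1)[0][0] if counts else None


def dispersion(counts):
    """Normalized Shannon entropy in [0, 1]; 0 means a single location."""
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


# Pages that mention the (hypothetical) entity of interest.
pages = [
    "The Oilers played in Edmonton before travelling to Calgary.",
    "Edmonton fans welcomed the Oilers home.",
]
city_counts = location_counts(pages, "city")
country_counts = location_counts(pages, "country")
print(geo_centre(city_counts), dispersion(city_counts))
print(geo_centre(country_counts), dispersion(country_counts))
```

In this toy example the mentions are spread over two cities but fall within a single country, which illustrates why the granularity at which a geo-centre is reported may depend on how dispersed the mentions are.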
