Inproceedings,

Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia

Proceedings of the 24th International Conference on World Wide Web, pages 1242–1252. New York, NY, USA, ACM (2015)
DOI: 10.1145/2736277.2741666

Abstract

Hyperlinks are an essential feature of the World Wide Web. They are especially important for online encyclopedias such as Wikipedia: an article can often only be understood in the context of related articles, and hyperlinks make it easy to explore this context. But important links are often missing, and several methods have been proposed to alleviate this problem by learning a linking model based on the structure of the existing links. Here we propose a novel approach to identifying missing links in Wikipedia. We build on the fact that the ultimate purpose of Wikipedia links is to aid navigation. Rather than merely suggesting new links that are in tune with the structure of existing links, our method finds missing links that would immediately enhance Wikipedia's navigability. We leverage data sets of navigation paths collected through a Wikipedia-based human-computation game in which users must find a short path from a start to a target article by only clicking links encountered along the way. We harness human navigational traces to identify a set of candidates for missing links and then rank these candidates. Experiments show that our procedure identifies missing links of high quality.
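
The candidate-and-rank idea in the abstract can be illustrated with a small sketch. This is not the paper's actual procedure: it assumes that a candidate is any (visited article, target) pair with no existing link, and that candidates are ranked simply by how many navigation paths support them. The function name `rank_missing_link_candidates` and the toy articles are hypothetical.

```python
from collections import Counter


def rank_missing_link_candidates(paths, links):
    """Rank (source, target) pairs that have no existing link.

    paths: list of navigation paths, each a list of article titles
           ending at the target article (assumed input format).
    links: set of (source, target) pairs for links that already exist.
    Returns a list of ((source, target), count), most frequent first.
    """
    counts = Counter()
    for path in paths:
        if len(path) < 2:
            continue  # a path needs at least a start and a target
        target = path[-1]
        # Count each visited article at most once per path, in visit order.
        for source in dict.fromkeys(path[:-1]):
            if source != target and (source, target) not in links:
                counts[(source, target)] += 1
    # Assumed ranking criterion: number of supporting paths.
    return counts.most_common()


# Toy data, invented purely for illustration.
paths = [
    ["Apple", "Fruit", "Tree", "Orchard"],
    ["Apple", "Tree", "Orchard"],
    ["Pear", "Fruit", "Orchard"],
]
links = {
    ("Apple", "Fruit"), ("Fruit", "Tree"), ("Tree", "Orchard"),
    ("Apple", "Tree"), ("Pear", "Fruit"), ("Fruit", "Orchard"),
}
ranked = rank_missing_link_candidates(paths, links)
```

In this toy example the pair ("Apple", "Orchard") is supported by two paths and ("Pear", "Orchard") by one, so the former ranks first. A real system would likely use richer ranking signals than raw counts.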
