Abstract
This paper presents a novel analysis and visualization of English Wikipedia
data. We focus on three aspects: basic statistics, the semantic structure and
age of the categories in this free online encyclopedia, and the content
coverage of its highly productive authors.
The paper starts with an introduction to Wikipedia and a review of related
work. We then introduce a suite of measures and approaches to analyze and map
the semantic structure of Wikipedia. The results show that co-occurrences of
categories within individual articles follow a power-law distribution and, when
mapped, reveal the well-clustered semantic structure of Wikipedia. The results
also reveal the content coverage of the articles' authors, although the roles
these authors play are as varied as the authors themselves. We conclude with a
discussion of major results and planned future work.