Article

GDELT: Global data on events, location, and tone

ISA Annual Convention (2013)

Abstract

GDELT (Global Data on Events, Location and Tone) is a new CAMEO-coded data set containing more than 200 million geolocated events with global coverage from 1979 to the present. The data are based on news reports from a variety of international news sources, coded using the Tabari system for events and additional software for location and tone. The data are freely available, and we expect to provide daily updates. This paper describes the news sources and some of their characteristics, the various processing steps used in generating the data, some comparisons with the KEDS Levant/Reuters and ICEWS/Asia data sets, and some visualizations. We conclude with an outline of planned enhancements to the data in the near future: recoding with new WordNet-enhanced dictionaries, extending the CAMEO coding to incorporate codes for financial events, disease outbreaks, and natural disasters, and developing an open-source Python-based successor to Tabari that will use parsed input from existing natural language processing tools.
