The Global Database of Events, Language and Tone (GDELT) Project monitors the world's broadcast, print, and web news from nearly every corner of every country in over 100 languages and identifies the people, locations, organizations, counts, themes, sources, emotions, quotes, images and events driving our global society every second of every day.

The GDELT 1.0 Event Database contains over a quarter-billion records organized into a set of tab-delimited files by date. More information about the project can be found at the GDELT Project website.

The dataset covers events from 1979 to the present. Beginning with April 1, 2013, files are created daily and records are stored by the date the event was found in the world's news media rather than the date it occurred. More than 97% of events are reported within 24 hours of happening, but a small number of events each day are past events being mentioned for the first time; an event that has been seen before is not included again. Files are provided in tab-delimited format but are named with a ".csv" extension to accommodate software packages that will not accept .txt or .tsv files.
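
Because the files carry a ".csv" extension but are tab-delimited, tools that guess the delimiter from the file name may split records incorrectly. The following is a minimal sketch of reading such a file with standard POSIX tools by stating the tab delimiter explicitly. The function names count_fields and tsv_column are hypothetical helpers introduced for illustration, and no column names or positions from the GDELT schema are assumed.

```shell
# Hypothetical helpers for tab-delimited files that are named *.csv.

# Print the number of tab-separated fields on the first line of a file.
count_fields() {
  awk -F'\t' 'NR == 1 { print NF; exit }' "$1"
}

# Print a single column (1-based index) from a tab-delimited file.
# cut splits on tabs by default, so commas inside a field are left alone.
tsv_column() {
  cut -f "$2" "$1"
}
```

For example, after downloading a daily file, `count_fields 20130530.export.csv` reports how many columns each record has, which can be checked against the schema that applies to that file's date range.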

For years between 2006 and April 1, 2013, records are stored in monthly and yearly files by the date the event took place. For years before 2006, records are stored in yearly files.

The GDELT 1.0 Event Database .csv (tab-delimited) files are located under the "events" prefix of the gdelt-open-data bucket. Two different schemas are used: one for files dated 1979 through March 2013 and one for files dated April 1, 2013 to the present. File names follow a different format for each time bucket (yearly, monthly, and daily), as illustrated below.

 

aws s3 cp s3://gdelt-open-data/events/1979.csv .

aws s3 cp s3://gdelt-open-data/events/201005.csv .

aws s3 cp s3://gdelt-open-data/events/20130530.export.csv .
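
The three naming formats above can be captured in a small helper that maps a date to the object key it would most likely be stored under. This is a sketch under stated assumptions: the function name gdelt_event_key is hypothetical, and the cutoffs (yearly files before 2006, monthly files from 2006 through March 2013, daily ".export.csv" files from April 1, 2013) are inferred from the description and the example names above. Listing the bucket remains the reliable way to confirm that a given key exists.

```shell
# Hypothetical helper: map a date (YYYYMMDD) to a likely object key.
# Assumed cutoffs: yearly before 2006, monthly for 2006 through
# March 2013, daily from April 1, 2013 onward.
gdelt_event_key() {
  ymd="$1"
  # Accept exactly eight digits and reject anything else.
  case "$ymd" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
    *) echo "expected YYYYMMDD, got: $ymd" >&2; return 1 ;;
  esac
  year=$(printf '%s' "$ymd" | cut -c1-4)
  month=$(printf '%s' "$ymd" | cut -c1-6)
  if [ "$ymd" -ge 20130401 ]; then
    printf 'events/%s.export.csv\n' "$ymd"
  elif [ "$year" -ge 2006 ]; then
    printf 'events/%s.csv\n' "$month"
  else
    printf 'events/%s.csv\n' "$year"
  fi
}
```

The result can be combined with the copy command shown above, for example `aws s3 cp "s3://gdelt-open-data/$(gdelt_event_key 20130530)" .`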

To list all files available under the "events" prefix of the gdelt-open-data bucket, you can use the following command:

aws s3 ls s3://gdelt-open-data/events/

The following ARN is for an Amazon SNS topic that publishes an Amazon S3 event message whenever a new CSV file has been added to the gdelt-open-data bucket:

arn:aws:sns:us-east-1:928094251383:gdelt-csv

Note that this SNS topic will only allow subscriptions via Amazon SQS or AWS Lambda.
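
As an illustration of the SQS route, the sketch below builds an `aws sns subscribe` call for the topic above. The function name subscribe_queue, the DRY_RUN switch, and the queue ARN in the usage note are assumptions introduced for this example and are not part of the dataset documentation. The sketch also assumes the queue already exists and that its access policy allows this topic to send messages to it.

```shell
# Topic ARN published for new GDELT CSV files (from the text above).
TOPIC_ARN="arn:aws:sns:us-east-1:928094251383:gdelt-csv"

# Hypothetical helper: subscribe an existing SQS queue to the topic.
# With DRY_RUN=1 the command is printed instead of being executed.
subscribe_queue() {
  queue_arn="$1"
  # Replace the function's arguments with the full command line.
  set -- aws sns subscribe \
    --region us-east-1 \
    --topic-arn "$TOPIC_ARN" \
    --protocol sqs \
    --notification-endpoint "$queue_arn"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Running `DRY_RUN=1 subscribe_queue arn:aws:sqs:us-east-1:123456789012:my-gdelt-queue` (a placeholder queue ARN) prints the command for review before it is run against a real account.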

Learn more about subscribing to SNS topics.

Source: GDELT Project
Category: Event Database, Geospatial
Format: CSV (tab-delimited)
License: Available for unlimited and unrestricted use for any academic, commercial, or governmental use of any kind without fee. See the Terms of Use.
Storage Service: Amazon S3
Location: s3://gdelt-open-data in US East Region
Update Frequency: Daily

If you would like to show us what you can do with GDELT on AWS or would like to receive updates on the project, please fill out the form below.

Educators, researchers and students can also apply for free credits to take advantage of the utility computing platform offered by AWS, along with Public Datasets such as GDELT on AWS. If you have a research project that could take advantage of GDELT on AWS, you can apply for AWS Cloud Credits for Research.