Visualization credit GDELT Project.
Supported by Google Jigsaw, the GDELT Project monitors the world's broadcast, print, and web news from nearly every corner of every country in over 100 languages and identifies the people, locations, organizations, counts, themes, sources, emotions, quotes, images and events driving our global society every second of every day, creating a free open platform for computing on the entire world.
GDELT monitors the world's news media from nearly every corner of every country
in print, broadcast, and web formats, in over 100 languages,
every moment of every day.
What would it look like to use massive computing power to see the world through others' eyes, to break down language and access barriers, facilitate conversation between societies, and empower local populations with the information and insights they need to live safe and productive lives? By quantitatively codifying human society's events, dreams and fears, can we map happiness and conflict, provide insight to vulnerable populations and even potentially forecast global conflict in ways that allow us as a society to come together to deescalate tensions, counter extremism, and break down cultural barriers? That is the vision of the GDELT Project. Put simply, the GDELT Project is a realtime open data global graph over human society as seen through the eyes of the world's news media, reaching deeply into local events, reaction, discourse, and emotions of the most remote corners of the world in near-realtime and making all of this available as an open data firehose to enable research over human society.
GDELT monitors print, broadcast, and web news media in over 100 languages from across every country in the world to keep continually updated on breaking developments anywhere on the planet. Its historical archives stretch back to January 1, 1979 and update every 15 minutes. Through its ability to leverage the world's collective news media, GDELT moves beyond the focus of the Western media towards a far more global perspective on what's happening and how the world is feeling about it.
From the Global Twitter Heartbeat to the SyFy Opposite Worlds Show (and many more to be announced shortly) we are exploring how social media is used around the world and how people and societies express themselves and talk about the world online. As these projects increase our collective understanding of the social sphere and especially how it is used in the non-Western world, we will be increasingly integrating social media into GDELT's monitoring streams.
In the words of George Santayana, "those who cannot remember the past are condemned to repeat it." History is highly cyclic and contemporary events are often deeply rooted in historical contexts, making the understanding of the past of critical importance to interpreting the present. GDELT is already the first truly multi-decade global event database, and through an array of collaborations and partnerships we are expanding its coverage all the way back to the year 1800, which, when complete, will offer more than two centuries of codified global history.
Even the largest teams of human translators cannot read and translate every word published by the world's news media each day. The GDELT Translingual platform represents what we believe is the largest realtime streaming news machine translation deployment in the world: all global news that GDELT monitors in 65 languages, representing 98.4% of its daily non-English monitoring volume, is translated in realtime into English and processed.
"The GDELT Project is an initiative to construct a catalog of human societal-scale behavior and beliefs across all countries of the world, connecting every person, organization, location, count, theme, news source, and event across the planet into a single massive network that captures what's happening around the world, what its context is and who's involved, and how the world is feeling about it, every single day."
Photo credit Google.
GDELT uses some of the world's most sophisticated natural language
and data mining algorithms, including the world's most powerful
deep learning algorithms, to extract more than 300 categories of events,
millions of themes and thousands of emotions and the networks that tie them together.
Monitoring nearly the entire world's news media is only the beginning: even the largest team of humans could not begin to read and analyze the billions upon billions of words and images published each day. GDELT uses some of the world's most sophisticated computer algorithms, custom-designed for global news media, running on "one of the most powerful server networks in the known Universe", together with some of the world's most powerful deep learning algorithms, to create a realtime computable record of global society that can be visualized, analyzed, modeled, examined and even forecast. A huge array of datasets totaling trillions of datapoints is available. Three primary data streams are created: one codifying physical activities around the world in over 300 categories; one recording the people, places, organizations, millions of themes and thousands of emotions underlying those events and their interconnections; and one codifying the visual narratives of the world's news imagery.
All three streams update every 15 minutes, offering near-realtime insights into the world around us. Underlying the streams is a vast array of sources, from hundreds of thousands of global media outlets to special collections like 215 years of digitized books, 21 billion words of academic literature spanning 70 years, human rights archives and even saturation processing of the raw closed captioning stream of almost 100 television stations across the US in collaboration with the Internet Archive's Television News Archive. Finally, the Internet Archive also captures nearly all worldwide online news coverage monitored by GDELT each day into its permanent archive to ensure its availability for future generations even in the face of repressive forces that continue to erode press freedoms around the world.
The GDELT Event Database records over 300 categories of physical activities around the world, from riots and protests to peace appeals and diplomatic exchanges, georeferenced to the city or mountaintop, across the entire planet dating back to January 1, 1979 and updated every 15 minutes.
Essentially it takes a sentence like "The United States criticized Russia yesterday for deploying its troops in Crimea, in which a recent clash with its soldiers left 10 civilians injured" and transforms this blurb of unstructured text into three structured database entries, recording US CRITICIZES RUSSIA, RUSSIA TROOP-DEPLOY UKRAINE (CRIMEA), and RUSSIA MATERIAL-CONFLICT CIVILIANS (CRIMEA).
Nearly 60 attributes are captured for each event, including the approximate location of the action and those involved. This translates the textual descriptions of world events captured in the news media into codified entries in a grand "global spreadsheet."
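To make the idea of a "global spreadsheet" concrete, here is a minimal Python sketch of how the three entries from the example sentence above might be held as structured rows. The `EventRecord` class and its four fields are hypothetical and chosen only for illustration; they are not GDELT's actual schema, which records nearly 60 attributes per event.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EventRecord:
    """One codified event: who did what to whom, and where.

    Hypothetical, illustrative fields only; not the real GDELT schema.
    """
    actor1: str
    action: str
    actor2: str
    location: str = ""


def example_events() -> List[EventRecord]:
    """Return the three entries the example sentence is described as producing."""
    return [
        EventRecord("US", "CRITICIZES", "RUSSIA"),
        EventRecord("RUSSIA", "TROOP-DEPLOY", "UKRAINE", "CRIMEA"),
        EventRecord("RUSSIA", "MATERIAL-CONFLICT", "CIVILIANS", "CRIMEA"),
    ]


def to_row(event: EventRecord) -> str:
    """Render a record in the ACTOR ACTION TARGET (LOCATION) style used in the text."""
    suffix = f" ({event.location})" if event.location else ""
    return f"{event.actor1} {event.action} {event.actor2}{suffix}"


if __name__ == "__main__":
    for event in example_events():
        print(to_row(event))
```

Once events are held as uniform rows like these, they can be filtered, counted, and joined like any other tabular data.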
Much of the true insight captured in the world's news media lies not in what it says, but in the context of how it says it. The GDELT Global Knowledge Graph (GKG) compiles a list of every person, organization, company, and location, along with several million themes and thousands of emotions, from every news report, using some of the most sophisticated named entity and geocoding algorithms in existence, designed specifically for the noisy and ungrammatical world of global news media.
The resulting network diagram constructs a graph over the entire world, encoding not only what's happening, but what its context is, who's involved, and how the world is feeling about it, updated every single day.
Worldwide news reporting is increasingly saturated by imagery, but historically GDELT has been limited to the textual contents of global journalism. As of January 2016, a random sample of up to a million images a day is drawn from the media of almost every country and processed through Google's Vision API.
Each image is annotated with the objects and activities it depicts, transcriptions of recognizable text (accurate enough to capture a handwritten Arabic protest sign held at an angle), the geographic location inferred from visual context, recognizable logos, and even the emotion of each human face. All of these annotations are delivered as an open data firehose quantifying the visual narratives of the world's media.
In addition to the news-based live Global Knowledge Graph, there are numerous special GKG collections available that focus on specialized sources of information or topics.
Collections currently available include 215 years of books comprising the majority of English language volumes digitized from US libraries, more than half a century of the output of the world's major human rights organizations, saturation processing of the closed captioning of more than 100 US television stations, and a special socio-cultural academic literature archive totaling 21 billion words spanning 70 years and more than 2,200 journals.
"GDELT is designed to help support new theories and descriptive understandings of the behaviors and driving forces of global-scale social systems from the micro-level of the individual through the macro-level of the entire planet by offering realtime synthesis of global societal-scale behavior into a rich quantitative database allowing realtime monitoring and analytical exploration of those trends."
Photo credit Georgetown University.
The entire GDELT database is 100% free and open and you can
download the raw datafiles, visualize it using the
GDELT Analysis Service, or analyze it at limitless scale with Google BigQuery.
The GDELT Analysis Service is a free cloud-based service that offers a variety of tools and services to allow you to visualize, explore, and export both the GDELT Event Database and the GDELT Global Knowledge Graph. This is a great way to get started exploring GDELT and what it can do for you, even if you don't have a technical background.
The entire quarter-billion-record GDELT Event Database is available in Google BigQuery, updated daily. You can query, export, and even conduct sophisticated analyses and modeling of the entire dataset using standard SQL, with even the most complex queries returning in near-realtime.
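As a rough illustration of querying the Event Database with standard SQL, the Python sketch below builds a query that counts event records per year. The table name `gdelt-bq.full.events` and the `Year` column are assumptions not stated on this page; check them against the dataset listing in BigQuery before relying on them. The helper `build_yearly_count_query` is hypothetical.

```python
def build_yearly_count_query(table: str = "gdelt-bq.full.events",
                             start_year: int = 1979) -> str:
    """Build a standard-SQL query that counts event records per year.

    Assumptions: the table name and the `Year` column are illustrative
    defaults, not taken from this page.
    """
    if not isinstance(start_year, int):
        raise TypeError("start_year must be an integer")
    return (
        "SELECT Year, COUNT(*) AS events\n"
        f"FROM `{table}`\n"
        f"WHERE Year >= {start_year}\n"
        "GROUP BY Year\n"
        "ORDER BY Year"
    )


if __name__ == "__main__":
    sql = build_yearly_count_query(start_year=2000)
    print(sql)
    # To run it, one option is the google-cloud-bigquery client library
    # (requires installation and Google Cloud credentials):
    #   from google.cloud import bigquery
    #   rows = bigquery.Client().query(sql).result()
```

The sketch only assembles the query text, so it runs without credentials; submitting it is left as a commented-out step.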
Advanced users and those with unique use cases can download all of the underlying records in CSV format. A single year of the GDELT GKG totals 2.5TB and few software packages can deal with even small subsets of the database, so most users will likely wish to use the GDELT Analysis Service or Google BigQuery.
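For those who do download the raw files, the practical approach at this scale is to stream records one at a time rather than load a whole file into memory. The Python sketch below shows that pattern with the standard `csv` module. It assumes tab-delimited rows with no quoting, which is an assumption about the file layout (pass `delimiter=","` if your files are comma-separated); the function names and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter
from typing import Iterable, Iterator, List, TextIO


def stream_rows(handle: TextIO, delimiter: str = "\t") -> Iterator[List[str]]:
    """Yield one parsed row at a time so memory use stays flat.

    Assumption: fields are separated by `delimiter` and are not quoted.
    """
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # skip blank lines
            yield row


def count_by_column(rows: Iterable[List[str]], column_index: int) -> Counter:
    """Tally the values in one column without materializing the rows."""
    counts: Counter = Counter()
    for row in rows:
        if column_index < len(row):
            counts[row[column_index]] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample rows standing in for a downloaded file.
    sample = io.StringIO("1\tUSA\tRUS\n2\tRUS\tUKR\n3\tUSA\tCHN\n")
    print(count_by_column(stream_rows(sample), column_index=1))
```

For a real download, replace the in-memory sample with `open(path, newline="", encoding="utf-8")`; because `stream_rows` is a generator, the same tally works on files far larger than available memory.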
"GDELT's evolving ability to capture ethnic, religious, and other social and cultural relationships will offer profoundly new insights into the interplay of group behavior over time, offering a rich new platform for understanding patterns of social evolution, while GDELT's realtime nature will expand current understanding of social systems beyond static snapshots towards theories that incorporate the nonlinear behavior and feedback effects that define human interaction and greatly enrich fragility indexes, early warning systems, and forecasting efforts."
The GDELT Blog is the official one-stop repository for the
latest news, announcements, information, and applications
of the world's largest open research platform on human society.
The Official GDELT Project Blog is the best place to keep track of all of the latest news, announcements, developments, features and releases, new applications, and media coverage of the GDELT Project. It's basically GDELT's home in the blogosphere!
The blog is also where we feature a steady stream of examples showcasing projects that use GDELT in new and innovative ways as well as "getting started" examples that show you how to perform basic analyses using GDELT, from mapping to modeling. Want to see an example of how to apply one of the tools in the GDELT Analysis Service? The blog offers examples of how to use each tool and the kinds of analyses they support.
Do you have a cool new application or visualization you've built using GDELT? Writing a paper or analysis using GDELT? Hosting a hackathon using it? Have an awesome new analysis or visualization software package you used GDELT to show off? Written a great story about GDELT or using GDELT? Drop us an email and we'd love to feature it on our blog!
"A global societal observatory for research on global society. Mapping the People, Organizations, Themes, Emotions, and Events Driving Global Events."