GetGDELT {GDELTtools} | R Documentation |
Download and subset GDELT V1 event data
Description
Download the GDELT V1 Event files necessary for a data set, import them, filter on various criteria, and return a data.frame.
Usage
GetGDELT(
start_date,
end_date = start_date,
row_filter,
...,
local_folder = tempdir(),
max_local_mb = Inf,
data_url_root = "http://data.gdeltproject.org/events/",
verbose = TRUE
)
Arguments
start_date |
character, earliest date to include in "YYYY-MM-DD" format. |
end_date |
character, latest date to include in "YYYY-MM-DD" format. |
row_filter |
<data-masking> Row selection. Expressions that return a logical value, and are defined in terms of the variables in GDELT. If multiple expressions are included, they are combined with the & operator. Only rows for which all conditions evaluate to TRUE are kept. |
... |
<tidy-select> Column selection. This takes the form of one or more unquoted expressions separated by commas. Variable names can be used as if they were positions in the data frame, so expressions like x:y can be used to select a range of variables. |
local_folder |
character, if specified, where downloaded files will be saved. |
max_local_mb |
numeric, the maximum size in MB of the downloaded files that will be retained. |
data_url_root |
character, URL for the folder with GDELT data files. |
verbose |
logical, if TRUE then indications of progress will be displayed. |
Details
Dates are parsed with guess_datetime in the datetimeutils package. The recommended format is "YYYY-MM-DD".
If local_folder is not specified then downloaded files are stored in tempdir(). If a needed file has already been downloaded to local_folder then that file is used instead of being downloaded again. This can greatly speed up later calls that need the same files.
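The reuse of already-downloaded files can be pictured with a minimal sketch in base R. The helper fetch_cached() and its file_name argument are hypothetical names introduced for illustration; this is not the package's internal code, only one plausible way to implement "download only if the file is missing".

```r
# Hypothetical helper (not part of GDELTtools): return the local path of
# `file_name`, downloading it only when it is not already in `local_folder`.
fetch_cached <- function(file_name,
                         local_folder = tempdir(),
                         data_url_root = "http://data.gdeltproject.org/events/") {
  # Make sure the cache folder exists.
  dir.create(local_folder, showWarnings = FALSE, recursive = TRUE)
  local_path <- file.path(local_folder, file_name)

  if (!file.exists(local_path)) {
    # Cache miss: fetch the file from the server in binary mode.
    utils::download.file(paste0(data_url_root, file_name), local_path, mode = "wb")
  }
  # Cache hit or fresh download: either way the file is now local.
  local_path
}
```

Because tempdir() is specific to the current R session, a persistent folder such as the "~/gdeltdata" used in the Examples is the natural choice when the files should survive between sessions.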
Value
A data.frame containing the rows that pass row_filter and the columns selected through ....
Filtering Results
The row_filter is passed to filter. This is a very flexible way to filter the rows. It's well worth checking out the filter documentation.
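As a rough illustration of how such row conditions behave, the sketch below applies them to a tiny invented data frame. It assumes that filter here is dplyr::filter, as the <data-masking> tag suggests; the data values are made up and only the column names mirror those used in the Examples.

```r
library(dplyr)

# Toy stand-in for GDELT rows; the values are invented for illustration.
events <- data.frame(
  ActionGeo_CountryCode = c("US", "US", "RS", NA),
  NumArticles           = c(2, 5, 2, 2),
  stringsAsFactors      = FALSE
)

# Conditions joined with & must all be TRUE for a row to be kept.
both <- filter(events, ActionGeo_CountryCode == "US" & NumArticles == 2)

# Conditions joined with | keep a row when at least one is TRUE.
either <- filter(events, ActionGeo_CountryCode == "RS" | NumArticles == 5)

# A comparison against NA yields NA, and filter() drops such rows;
# is.na() is the way to ask for missing values explicitly.
missing_country <- filter(events, is.na(ActionGeo_CountryCode))
```

The same expressions can be supplied unquoted as the row_filter argument, as the Examples do.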
Selecting Columns
The ... is passed to select. This is a very flexible way to choose which columns to return. It's well worth checking out the select documentation.
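The column selectors used in the Examples can be tried on a small invented data frame. This sketch assumes that select here is dplyr::select, as the <tidy-select> tag suggests; the values are made up and only a handful of GDELT-style column names are included.

```r
library(dplyr)

# Toy stand-in with a few GDELT-style column names; values are invented.
events <- data.frame(
  GLOBALEVENTID = 1:2,
  SQLDATE       = c(19790101L, 19790102L),
  Actor1Code    = c("USA", NA),
  Actor2Code    = c("COP", "MED"),
  NumArticles   = c(2, 5),
  stringsAsFactors = FALSE
)

# Select by position: the first two columns.
by_position <- select(events, 1:2)

# Select a contiguous range of columns by name.
by_range <- select(events, SQLDATE:Actor2Code)

# Select by name pattern. These helpers ignore case by default,
# so "date" matches SQLDATE and "actor" matches Actor1Code.
by_pattern <- select(events, contains("date"), starts_with("actor"))
```

Any of these selectors can be passed unquoted after row_filter in a GetGDELT call.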
Author(s)
Stephen R. Haptonstahl | srh@haptonstahl.org |
Thomas Scherer | tscherer@princeton.edu |
John Beieler | jub270@psu.edu |
References
GDELT: Global Data on Events, Location and Tone, 1979-2013. Presented at the 2013 meeting of the International Studies Association in San Francisco, CA. https://www.gdeltproject.org/
Examples
## Not run:
# All events for 1979, all columns
df1 <- GetGDELT(start_date = "1979-01-01", end_date = "1979-12-31")

# Keep only rows whose action location is coded "US"
df2 <- GetGDELT(start_date = "1979-01-01", end_date = "1979-12-31",
                row_filter = ActionGeo_CountryCode == "US")

# Combine several conditions and return only the first five columns
df3 <- GetGDELT(start_date = "1979-01-01", end_date = "1979-12-31",
                row_filter = Actor2Geo_CountryCode == "RS" & NumArticles == 2 &
                  is.na(Actor1CountryCode),
                1:5)

# Keep rows matching either code; select columns by name pattern
df4 <- GetGDELT(start_date = "1979-01-01", end_date = "1979-12-31",
                row_filter = Actor2Code == "COP" | Actor2Code == "MED",
                contains("date"), starts_with("actor"))

# Specify a local folder to store the downloaded files
df5 <- GetGDELT(start_date = "1979-01-01", end_date = "1979-12-31",
                row_filter = ActionGeo_CountryCode == "US",
                local_folder = "~/gdeltdata")
## End(Not run)