get_eurostat_raw {restatapi} | R Documentation |
Get Eurostat data as it is
Description
Download data sets from the Eurostat database.
Usage
get_eurostat_raw(
id,
mode = "txt",
cache = TRUE,
update_cache = FALSE,
cache_dir = NULL,
compress_file = TRUE,
stringsAsFactors = FALSE,
keep_flags = FALSE,
check_toc = FALSE,
melt = TRUE,
verbose = FALSE,
...
)
Arguments
id |
A code name for the dataset of interest.
See |
mode |
defines the format of the downloaded dataset. It can be |
cache |
a logical whether to do caching. Default is |
update_cache |
a logical with a default value |
cache_dir |
a path to a cache directory. The |
compress_file |
a logical whether to compress the
RDS-file in caching. Default is |
stringsAsFactors |
if |
keep_flags |
a logical whether the observation status (flags) - e.g. "confidential",
"provisional", etc. - should be kept in a separate column or if they
can be removed. Default is |
check_toc |
a boolean whether to check the provided |
melt |
a boolean with default value |
verbose |
A boolean with default |
... |
further argument for the |
Details
Data sets are downloaded from the Eurostat bulk download facility in CSV, TSV or SDMX format.
The id should be a value from the code column of the table of contents (get_eurostat_toc), and can be searched for with the search_eurostat_toc function. The id value can also be retrieved from the Eurostat database, which gives the code in parentheses after every dataset in the Data Navigation Tree.
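As a rough sketch of the lookup described above, the snippet below searches the table of contents for a keyword and lists the matching codes. It assumes that the restatapi package is installed, that Eurostat is reachable over the network, and that search_eurostat_toc takes the search pattern as its first argument; the keyword "milk" is only an illustration.

```r
# Sketch only: needs the restatapi package and network access to Eurostat.
hits <- NULL
if (requireNamespace("restatapi", quietly = TRUE)) {
  library(restatapi)
  # Assumption: the first argument of search_eurostat_toc is the search pattern.
  hits <- tryCatch(search_eurostat_toc("milk"), error = function(e) NULL)
  if (!is.null(hits) && nrow(hits) > 0) {
    # The "code" column holds the values that can be passed as id.
    print(head(hits$code))
  }
}
```

Any value printed from the code column that refers to a dataset can then be passed as the id argument of get_eurostat_raw.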
By default, all datasets are downloaded in TSV format and cached, as they are often rather large.
The datasets are cached in memory (default), or they can be stored in a temporary directory if cache_dir or option(restatapi_cache_dir) is defined.
The cache can be emptied with clean_restatapi_cache.
If the id is checked in the TOC, the data will be saved in the cache with the date from the "lastUpdate" column of the TOC; otherwise it is saved with the current date.
Value
a data.table with the following columns if the default melt=TRUE is used:
FREQ | The frequency of the data (Annual, Semi-annual, Half-year, Quarterly, Monthly, Weekly, Daily) |
dimension names | One column for each dimension in the data |
TIME_FORMAT | A column for the time format, if the source file is SDMX-ML and the data was not loaded from a previously cached TSV download (this column is missing if the source file is TSV) |
time/TIME_PERIOD | A column for the time dimension, where the name of the column depends on the source file (TSV/SDMX-ML) |
values/OBS_VALUE | A column for numerical values, where the name of the column depends on the source file (TSV/SDMX-ML) |
flags/OBS_STATUS | A column for flags if keep_flags=TRUE, otherwise this column is not included in the data table; the name of the column depends on the source file (TSV/SDMX-ML) |
The data does not include all missing values. An observation is dropped if both its value and its flag are missing for a particular time period.
If melt=FALSE, the result is a data.table where the first column contains the comma-separated values of the various dimensions, and the remaining columns contain the observations for each time period.
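To make the two layouts concrete, here is a small offline illustration built with data.table. The toy table, its values, and the column name dims are hypothetical stand-ins rather than real Eurostat output, and the reshaping shown is one possible way to get from the wide layout to the long one, not necessarily what the package does internally.

```r
library(data.table)

# Hypothetical stand-in for a melt=FALSE result: the first column holds the
# comma-separated dimension values, the other columns hold one period each.
wide <- data.table(
  dims   = c("A,THS_T,AT", "A,THS_T,BE"),
  "2020" = c(10.5, NA),
  "2021" = c(11.0, 7.2)
)

# Reshape to one row per observation, dropping missing observations.
long <- melt(wide,
             id.vars = "dims",
             variable.name = "time",
             value.name = "values",
             na.rm = TRUE,
             variable.factor = FALSE)

# Split the combined dimension column into one column per dimension.
long[, c("freq", "unit", "geo") := tstrsplit(dims, ",", fixed = TRUE)]
long[, dims := NULL]

print(long)
```

The missing 2020 observation for BE is dropped, so three rows remain, which mirrors the note above that missing observations are not included in the returned data.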
See Also
get_eurostat_data, get_eurostat_bulk
Examples
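# Use a short timeout unless the release string indicates an Amazon/AWS/Azure host
if (!(grepl("amzn|-aws|-azure ",Sys.info()['release']))) options(timeout=2)
# Default download, keeping the observation flags
head(get_eurostat_raw("agr_r_milkpr",keep_flags=TRUE))
# XML mode, checking the id in the TOC and refreshing the cache
head(get_eurostat_raw("avia_par_ee",mode="xml",check_toc=TRUE,update_cache=TRUE,verbose=TRUE))
options(restatapi_update=FALSE)
# Unmelted result
head(get_eurostat_raw("avia_par_me",mode="txt",melt=FALSE))
# Cache to an uncompressed file in a temporary directory
head(get_eurostat_raw("avia_par_me",
                      mode="txt",
                      cache_dir=tempdir(),
                      compress_file=FALSE,
                      verbose=TRUE))
# Set the timeout back to 60 seconds
options(timeout=60)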