get_file {dataverse} | R Documentation |
Download Dataverse file as a raw binary
Description
Download Dataverse file(s). The get_file_* functions return a raw binary file, which cannot be readily analyzed in R. To use the objects as data frames, see the get_dataframe_* functions at ?get_dataframe instead.
Usage
get_file(
  file,
  dataset = NULL,
  format = c("original", "bundle"),
  vars = NULL,
  return_url = FALSE,
  key = Sys.getenv("DATAVERSE_KEY"),
  server = Sys.getenv("DATAVERSE_SERVER"),
  original = TRUE,
  ...
)
get_file_by_name(
  filename,
  dataset,
  format = c("original", "bundle"),
  vars = NULL,
  return_url = FALSE,
  key = Sys.getenv("DATAVERSE_KEY"),
  server = Sys.getenv("DATAVERSE_SERVER"),
  original = TRUE,
  ...
)
get_file_by_id(
  fileid,
  dataset = NULL,
  format = c("original", "bundle"),
  vars = NULL,
  original = TRUE,
  progress = NULL,
  return_url = FALSE,
  key = Sys.getenv("DATAVERSE_KEY"),
  server = Sys.getenv("DATAVERSE_SERVER"),
  ...
)
get_file_by_doi(
  filedoi,
  dataset = NULL,
  format = c("original", "bundle"),
  vars = NULL,
  original = TRUE,
  return_url = FALSE,
  key = Sys.getenv("DATAVERSE_KEY"),
  server = Sys.getenv("DATAVERSE_SERVER"),
  ...
)
Arguments
file |
An integer specifying a file identifier; or a vector of integers
specifying file identifiers; or, if used with the prefix |
dataset |
A character specifying a persistent identification ID for a dataset,
for example |
format |
A character string specifying a file format for download.
By default, this is “original” (the original file format). If |
vars |
A character vector specifying one or more variable names, used to extract a subset of the data. |
return_url |
Instead of downloading the file, return the URL for download.
Defaults to |
key |
A character string specifying a Dataverse server API key. If one
is not specified, functions calling authenticated API endpoints will fail.
Keys can be specified atomically or globally using
|
server |
A character string specifying a Dataverse server.
Multiple Dataverse installations exist, with |
original |
A logical, defaulting to TRUE. If an ingested (.tab) version is
available, should the original version be downloaded instead of the ingested one? If there is
no ingested version, this is set to NA. Note in |
... |
Additional arguments passed to an HTTP request function, such as
|
filename |
Name of the file within the dataset, with the file extension as shown in Dataverse. For example, if nlsw88.dta was the original but is displayed as the ingested nlsw88.tab, use the ingested version. |
fileid |
A numeric ID internally used for |
progress |
Whether to show a progress bar of the download.
If not specified, will be set to |
filedoi |
A DOI for a single file (not the entire dataset), of the form
|
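Because the key and server arguments default to Sys.getenv("DATAVERSE_KEY") and Sys.getenv("DATAVERSE_SERVER") (see Usage), one way to set them once per session is through environment variables. The following is a minimal sketch using only base R; the key value "examplekey" is a hypothetical placeholder, not a real API key.

```r
# Set session-wide defaults. The `key` and `server` arguments pick these up
# because their defaults call Sys.getenv().
Sys.setenv(
  DATAVERSE_SERVER = "demo.dataverse.org",
  DATAVERSE_KEY = "examplekey"  # hypothetical placeholder, not a real key
)

# These are the values the function defaults would resolve to.
server_default <- Sys.getenv("DATAVERSE_SERVER")
key_default <- Sys.getenv("DATAVERSE_KEY")
print(server_default)
```

Passing key or server explicitly in a call overrides these session-wide values for that call only.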
Details
This function provides access to data files from a Dataverse entry. get_file is a general wrapper, and can take either dataverse objects, file IDs, or a filename and dataverse. Internally, all functions download each file by get_file_by_id.
get_file_by_name is a shorthand for running get_file by specifying a file name (filename) and dataset (dataset).
get_file_by_doi obtains a file by its file DOI, bypassing the dataset argument.
Value
get_file returns a raw vector (or a list of raw vectors, if length(file) > 1), which can be saved locally with the writeBin function. To load datasets into R as data frames, see get_dataframe_by_name.
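As a rough illustration of saving a raw vector with writeBin, the sketch below uses a stand-in raw vector built with charToRaw rather than a real download, so it runs without network access. The object name raw_file and the file name example.csv are assumptions made for this example.

```r
# Stand-in for the raw vector that get_file() would return
# (assumption: no real download is performed here).
raw_file <- charToRaw("id,value\n1,10\n2,20\n")

# Save the bytes unchanged; the user chooses the file extension.
path <- file.path(tempdir(), "example.csv")
writeBin(raw_file, path)

# Read the bytes back to check that the file on disk matches the raw vector.
round_trip <- readBin(path, what = "raw", n = file.size(path))
print(identical(round_trip, raw_file))
```

Writing the bytes as-is keeps the file exactly as served, which is why the extension has to match the actual format of the downloaded content.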
See Also
To load the objects as datasets, see get_dataframe_by_name.
Examples
## Not run:
# 1. Using filename and dataverse
f1 <- get_file_by_name(
  filename = "nlsw88.tab",
  dataset = "10.70122/FK2/PPIAXE",
  server = "demo.dataverse.org"
)
# 2. Using file DOI
f2 <- get_file_by_doi(
  filedoi = "10.70122/FK2/PPIAXE/MHDB0O",
  server = "demo.dataverse.org"
)
# 3. Two-steps: Find ID from get_dataset
d3 <- get_dataset("doi:10.70122/FK2/PPIAXE", server = "demo.dataverse.org")
f3 <- get_file(d3$files$id[1], server = "demo.dataverse.org")
# 4. Retrieve multiple raw data in list
f4_meta <- get_dataset(
  "doi:10.70122/FK2/PPIAXE",
  server = "demo.dataverse.org"
)
f4 <- get_file(f4_meta$files$id, server = "demo.dataverse.org")
names(f4) <- f4_meta$files$label
# Write binary files. To load into R environment, use get_dataframe_by_name()
# The appropriate file extension needs to be assigned by the user.
writeBin(f1, "nlsw88.dta") # .tab extension but save as dta
writeBin(f4[["nlsw88_rds-export.rds"]], "nlsw88.rds") # originally a rds file
writeBin(f4[["nlsw88.tab"]], "nlsw88.dta") # originally a dta file
## End(Not run)
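When several files are retrieved as a named list of raw vectors, as in example 4, each element can be written out in a loop. The sketch below is an offline illustration under assumptions: the list files, its labels, and the directory name dataverse_files are hypothetical stand-ins for the result of get_file and the file labels.

```r
# Stand-in for a named list of raw vectors (assumption: in practice this
# would come from get_file() with names taken from the file labels).
files <- list(
  "a.txt" = charToRaw("first file"),
  "b.txt" = charToRaw("second file")
)

# Write each element to its own file, using the list name as the file name.
out_dir <- file.path(tempdir(), "dataverse_files")
dir.create(out_dir, showWarnings = FALSE)
for (label in names(files)) {
  writeBin(files[[label]], file.path(out_dir, label))
}

written <- sort(list.files(out_dir))
print(written)
```

If a label carries the ingested .tab extension while the bytes are in the original format, the file name should be adjusted before writing, as the comments in the examples note.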