redcap_read {REDCapR} | R Documentation |
Read records from a REDCap project in subsets, and stack them together before returning a dataset
Description
From an external perspective, this function is similar to redcap_read_oneshot(). The internals differ in that redcap_read() retrieves subsets of the data, and then combines them before returning (among other objects) a single base::data.frame(). This function can be more appropriate than redcap_read_oneshot() when returning large datasets that could tie up the server.
Usage
redcap_read(
  batch_size = 100L,
  interbatch_delay = 0.5,
  continue_on_error = FALSE,
  redcap_uri,
  token,
  records = NULL,
  records_collapsed = "",
  fields = NULL,
  fields_collapsed = "",
  forms = NULL,
  forms_collapsed = "",
  events = NULL,
  events_collapsed = "",
  raw_or_label = "raw",
  raw_or_label_headers = "raw",
  export_checkbox_label = FALSE,
  export_survey_fields = FALSE,
  export_data_access_groups = FALSE,
  filter_logic = "",
  datetime_range_begin = as.POSIXct(NA),
  datetime_range_end = as.POSIXct(NA),
  col_types = NULL,
  guess_type = TRUE,
  guess_max = NULL,
  http_response_encoding = "UTF-8",
  locale = readr::default_locale(),
  verbose = TRUE,
  config_options = NULL,
  id_position = 1L
)
Arguments
batch_size |
The maximum number of subject records a single batch should contain. The default is 100. |
interbatch_delay |
The number of seconds the function will wait before requesting a new subset from REDCap. The default is 0.5 seconds. |
continue_on_error |
If an error occurs while reading, should records in subsequent batches be attempted? The default is FALSE. |
redcap_uri |
The URI (uniform resource identifier) of the REDCap project. Required. |
token |
The user-specific string that serves as the password for a project. Required. |
records |
An array, where each element corresponds to the ID of a desired record. Optional. |
records_collapsed |
A single string, where the desired ID values are separated by commas. Optional. |
fields |
An array, where each element corresponds to a desired project field. Optional. |
fields_collapsed |
A single string, where the desired field names are separated by commas. Optional. |
forms |
An array, where each element corresponds to a desired project form. Optional. |
forms_collapsed |
A single string, where the desired form names are separated by commas. Optional. |
events |
An array, where each element corresponds to a desired project event. Optional. |
events_collapsed |
A single string, where the desired event names are separated by commas. Optional. |
raw_or_label |
A string (either |
raw_or_label_headers |
A string (either |
export_checkbox_label |
specifies the format of checkbox field values
specifically when exporting the data as labels. If |
export_survey_fields |
A boolean that specifies whether to export the survey identifier field (e.g., 'redcap_survey_identifier') or survey timestamp fields (e.g., instrument+'_timestamp'). The timestamp outputs reflect the survey's completion time (according to the time and timezone of the REDCap server). |
export_data_access_groups |
A boolean value that specifies whether or
not to export the |
filter_logic |
String of logic text (e.g., |
datetime_range_begin |
To return only records that have been created or modified after a given datetime, provide a POSIXct value. If not specified, REDCap will assume no begin time. |
datetime_range_end |
To return only records that have been created or modified before a given datetime, provide a POSIXct value. If not specified, REDCap will assume no end time. |
col_types |
A |
guess_type |
A boolean value indicating if all columns should be
returned as character. If true, |
guess_max |
Deprecated. |
http_response_encoding |
The encoding value passed to
|
locale |
a |
verbose |
A boolean value indicating if |
config_options |
A list of options to pass to |
id_position |
The column position of the variable that uniquely identifies the subject (typically |
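The array arguments and their *_collapsed counterparts describe the same selection in two forms. The sketch below shows one way the comma-separated strings and a POSIXct value might be prepared in base R. The record IDs, field names, date, and the "UTC" time zone are illustrative assumptions, not values from a real project.

```r
# Illustrative values only; these IDs and field names are hypothetical.
desired_records <- c(1, 3, 5)
desired_fields  <- c("record_id", "dob", "weight")

# The same selections expressed as single comma-separated strings.
records_collapsed <- paste(desired_records, collapse = ",")
fields_collapsed  <- paste(desired_fields, collapse = ",")

# A POSIXct value of the kind accepted by datetime_range_begin or
# datetime_range_end. The time zone here is an assumption.
range_begin <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")
```

Either form of a selection can then be supplied through the corresponding argument.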
Details
redcap_read() internally uses multiple calls to redcap_read_oneshot() to select and return data. Initially, only the primary key is queried through the REDCap API. The resulting list of IDs is then split into batches, whose sizes are determined by the batch_size parameter. REDCap is then queried for all variables of each batch's subjects. This is repeated for each batch, before returning a unified base::data.frame().
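The batching described above can be sketched in base R. This is a simplified illustration and not the package's actual implementation: fetch_batch() and read_in_batches() are hypothetical names, and fetch_batch() fabricates rows locally instead of contacting a REDCap server.

```r
# Hypothetical stand-in for one API request: returns all columns for the
# given IDs. In real use, this role is played by a call to the server.
fetch_batch <- function(ids) {
  data.frame(record_id = ids, value = ids * 10)
}

read_in_batches <- function(ids, batch_size = 100L, interbatch_delay = 0.5) {
  # Assign each ID to a batch: the first batch_size IDs go to batch 1, etc.
  batch_index <- ceiling(seq_along(ids) / batch_size)
  batches     <- split(ids, batch_index)

  pieces <- vector("list", length(batches))
  for (i in seq_along(batches)) {
    # Pause between requests so the server can attend to other users.
    if (i > 1L) Sys.sleep(interbatch_delay)
    pieces[[i]] <- fetch_batch(batches[[i]])
  }

  # Stack the per-batch data frames into a single data frame.
  do.call(rbind, pieces)
}

# Five IDs with a batch size of 2 produce three batches (2 + 2 + 1 records).
ds <- read_in_batches(ids = 1:5, batch_size = 2L, interbatch_delay = 0.01)
```

In this sketch a smaller batch_size means more requests, each returning a smaller object.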
The function allows a delay between calls, which allows the server to attend to other users' requests (such as the users entering data in a browser). In other words, a delay between batches helps avoid bogging down the webserver when exporting/importing a large dataset.
A second benefit is that less RAM is required on the webserver. Because each batch is smaller than the entire dataset, the webserver tackles more manageably sized objects in memory. Consider batching if you encounter the error:
ERROR: REDCap ran out of server memory. The request cannot be processed. Please try importing/exporting a smaller amount of data.
For redcap_read() to function properly, the user must have Export permissions for the 'Full Data Set'. Users with only 'De-Identified' export privileges can still use redcap_read_oneshot(). To grant the appropriate permissions:
go to 'User Rights' in the REDCap project site,
select the desired user, and then select 'Edit User Privileges',
in the 'Data Exports' radio buttons, select 'Full Data Set'.
Value
Currently, a list is returned with the following elements:
- data: An R base::data.frame() of the desired records and columns.
- success: A boolean value indicating if the operation was apparently successful.
- status_codes: A collection of http status codes, separated by semicolons. There is one code for each batch attempted.
- outcome_messages: A collection of human readable strings indicating the operations' outcomes, separated by semicolons. There is one message for each batch attempted. In an unsuccessful operation, it should contain diagnostic information.
- records_collapsed: The desired records IDs, collapsed into a single string, separated by commas.
- fields_collapsed: The desired field names, collapsed into a single string, separated by commas.
- filter_logic: The filter statement passed as an argument.
- elapsed_seconds: The duration of the function.
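A caller will usually want to check success before using data. The sketch below works on a hand-built list that mimics the elements listed above; the values are made up, and the exact spacing around the semicolons in status_codes is an assumption, so the split tolerates optional whitespace.

```r
# Hypothetical stand-in for the list returned by redcap_read();
# every value here is invented for illustration.
result <- list(
  data             = data.frame(record_id = 1:3),
  success          = TRUE,
  status_codes     = "200; 200",
  outcome_messages = "batch 1 ok; batch 2 ok",
  elapsed_seconds  = 1.2
)

if (result$success) {
  ds <- result$data
} else {
  stop("Read failed: ", result$outcome_messages)
}

# One status code per batch; allow optional whitespace after each semicolon.
codes  <- as.integer(strsplit(result$status_codes, ";\\s*")[[1]])
all_ok <- all(codes == 200L)
```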
Author(s)
Will Beasley
References
The official documentation can be found on the 'API Help Page' and 'API Examples' pages on the REDCap wiki (i.e., https://community.projectredcap.org/articles/456/api-documentation.html and https://community.projectredcap.org/articles/462/api-examples.html). If you do not have an account for the wiki, please ask your campus REDCap administrator to send you the static material.
Examples
## Not run:
uri   <- "https://bbmc.ouhsc.edu/redcap/api/"
token <- "9A81268476645C4E5F03428B8AC3AA7B"
REDCapR::redcap_read(batch_size=2, redcap_uri=uri, token=token)$data

# Specify the column types.
col_types <- readr::cols(
  record_id = readr::col_integer(),
  race___1  = readr::col_logical(),
  race___2  = readr::col_logical(),
  race___3  = readr::col_logical(),
  race___4  = readr::col_logical(),
  race___5  = readr::col_logical(),
  race___6  = readr::col_logical()
)
REDCapR::redcap_read(
  redcap_uri = uri,
  token      = token,
  col_types  = col_types,
  batch_size = 2
)$data
## End(Not run)