fetch_survey {qualtRics}

R Documentation

Download a survey and import it into R

Description

Download a Qualtrics survey you own via API and import the survey directly into R.

Usage

fetch_survey(
  surveyID,
  limit = NULL,
  start_date = NULL,
  end_date = NULL,
  time_zone = NULL,
  include_display_order = TRUE,
  include_metadata = NULL,
  include_questions = NULL,
  include_embedded = NULL,
  unanswer_recode = NULL,
  unanswer_recode_multi = unanswer_recode,
  breakout_sets = TRUE,
  import_id = FALSE,
  label = TRUE,
  convert = TRUE,
  add_column_map = TRUE,
  add_var_labels = TRUE,
  strip_html = TRUE,
  col_types = NULL,
  verbose = TRUE,
  tmp_dir = tempdir(),
  last_response = deprecated(),
  force_request = deprecated(),
  save_dir = deprecated()
)

Arguments

`surveyID`	String. Unique ID for the survey you want to download. Returned as `id` by `all_surveys()`.
`limit`	Integer. Maximum number of responses exported. Defaults to `NULL` (download all responses).
`start_date`, `end_date`	POSIXct, POSIXlt, or Date object, or length-1 string equivalent of form "YYYY-MM-DD" or "YYYY-MM-DD HH:MM:SS". ("/" is also acceptable in place of "-".) Only export survey responses that were recorded within the range specified by one or both arguments (i.e. referencing RecordedDate). Each defaults to `NULL` (unbounded). See Details for important information about both the package and Qualtrics' handling of start/end times.
`time_zone`	String. Time zone to use for date/time metadata variables in response dataframe (e.g. StartDate). Must match a time zone name from `base::OlsonNames()`. Defaults to `NULL`, which uses the current system timezone (from `base::Sys.timezone()`). Also applied to arguments `start_date` and/or `end_date` when given Date or string objects (see above); ignored when these arguments are given POSIXlt/POSIXct objects.
`include_display_order`	Logical. If `TRUE`, downloads from surveys using block/question/answer display randomization will contain additional variables indicating the randomization pattern used for each case. Defaults to `TRUE`.
`include_metadata`, `include_questions`, `include_embedded`	Character vector. Specify variables to include in download. Defaults to `NULL` (keep all). `NA` or `character()` excludes all variables for that category. See Details for more on using each inclusion argument.
`unanswer_recode`	Integer-like. Recode seen-but-unanswered (usually skipped) questions using this value. Defaults to `NULL`.
`unanswer_recode_multi`	Integer-like. Recode seen-but-unanswered multi-select questions (checkboxes) using this value. Defaults to value for `unanswer_recode`.
`breakout_sets`	Logical. If `TRUE`, multi-value fields (e.g. each option of a multi-select multiple choice question) will be returned as separate columns. If `FALSE`, they will be returned as 1 column with each element containing all values. Defaults to `TRUE`.
`import_id`	Logical. If `TRUE`, column names will use Qualtrics import IDs (e.g. "QID123") instead of user-modifiable names (e.g. default names like "Q3" or custom names). Defaults to `FALSE` (user-modifiable names). Note that this also affects (otherwise unmodifiable) names of metadata columns; see the "`include_metadata`" section in Details below.
`label`	Logical. If `TRUE` (default), will return text of answer choices, instead of recoded values (`FALSE`).
`convert`	Logical. If `TRUE`, then the `fetch_survey()` function will convert certain question types (e.g. multiple choice) to proper data type in R. Defaults to `TRUE`.
`add_column_map`	Logical. Add an attribute to the returned response data frame containing metadata associated with the response download, including variable names, question/choice text, and Qualtrics import IDs. This column map can be subsequently obtained using `extract_colmap()`. Defaults to `TRUE`.
`add_var_labels`	Logical. If `TRUE`, then the item description from each variable (equivalent to the one in the column map) will be added as a "label" attribute using `sjlabelled::set_label()`. Useful for reference as well as cross-compatibility with other stats packages (e.g., Stata, see documentation in `sjlabelled`). Defaults to `TRUE`.
`strip_html`	Logical. If `TRUE`, then remove HTML tags from variable descriptions. Defaults to `TRUE`. Ignored if `add_column_map` and `add_var_labels` are both `FALSE`.
`col_types`	Optional. This argument provides a way to manually overwrite column types that may be incorrectly guessed. Takes a `readr::cols()` specification. See example below and `readr::cols()` for formatting details. Defaults to `NULL`. Overwritten by `convert = TRUE`.
`verbose`	Logical. If `TRUE`, verbose messages will be printed to the R console. Defaults to `TRUE`.
`tmp_dir`	Path to filesystem directory. Qualtrics returns response data in compressed (zip) form. To extract raw data, the zip file must be briefly written to disk (the file is then promptly deleted). By default, the system's temporary directory is used for this (see `tempdir()`), but users needing more control can specify an alternate location here.
`last_response`	Deprecated.
`force_request`	Deprecated.
`save_dir`	Deprecated.
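
As a small illustration of the `time_zone` requirement above, the sketch below checks a candidate name against `base::OlsonNames()` before it would be passed to `fetch_survey()`. The helper `is_valid_tz()` is hypothetical and is not part of qualtRics.

```r
# Hypothetical helper (not part of qualtRics): TRUE if `tz` is a single
# time zone name known to this R installation.
is_valid_tz <- function(tz) {
  is.character(tz) && length(tz) == 1 && tz %in% OlsonNames()
}

is_valid_tz("UTC")        # a name listed by OlsonNames()
is_valid_tz("Not/AZone")  # a made-up name
```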

Details

If the request to the Qualtrics API made by this function fails, the request will be retried. If you see these failures with a 5xx server error (such as a 504 error), be patient while the request is retried; it will typically succeed on retrying. If you see other types of errors, retrying is unlikely to help.

`start_date` & `end_date` arguments

The Qualtrics API endpoint for this function treats start_date and end_date slightly differently; end_date is exclusive, meaning only responses recorded up to the moment before the specified end_date will be returned. This permits easier automation; a previously-used end_date can become the start_date of a subsequent request without downloading duplicate records.
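
The automation pattern described above might be sketched as follows. The wrapper `fetch_new_responses()` and its argument names are hypothetical; only `fetch_survey()` and its `surveyID`, `start_date`, and `end_date` arguments come from the package. Calling the wrapper requires qualtRics and registered API credentials.

```r
# Hypothetical wrapper (not part of qualtRics).
# `since` and `until` are assumed to be POSIXct date-times.
fetch_new_responses <- function(survey_id, since, until = Sys.time()) {
  responses <- qualtRics::fetch_survey(
    surveyID = survey_id,
    start_date = since,
    end_date = until  # exclusive upper bound
  )
  # Because end_date is exclusive, this call's `until` can be reused as
  # the next call's `since` without downloading duplicate records.
  list(responses = responses, next_start = until)
}
```

Storing `next_start` between runs (for example in a file or database) is left to the caller.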

As a convenience for users working interactively, the qualtRics package also accepts Date(-like) input to each argument, which when used implies a time of 00:00:00 on the given date (and time zone). When a Date(-like) is passed to end_date, however, the date will be incremented by one day before making the API request. This adjustment is intended to provide interactive users with more intuitive results; for example, specifying "2022/06/02" for both start_date and end_date will return all responses for that day (instead of the zero responses that would be returned if end_date were not adjusted).
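
To make the adjustment concrete, the base R sketch below mimics the documented behavior for a Date-like end_date. It is an illustration only, not the package's internal code; the helper `effective_end()` is hypothetical and handles Date objects and "YYYY-MM-DD" (or "YYYY/MM/DD") strings, passing date-time objects through unchanged.

```r
# Illustration only (not qualtRics internals).
effective_end <- function(end_date, tz = "UTC") {
  if (inherits(end_date, "POSIXt")) {
    return(end_date)  # date-times are used as given
  }
  next_day <- as.Date(end_date) + 1       # Date-like input: move forward one day
  as.POSIXct(format(next_day), tz = tz)   # midnight at the start of the next day
}

start <- as.POSIXct("2022-06-02", tz = "UTC")
end <- effective_end("2022/06/02")

# The resulting window covers the whole of 2022-06-02
difftime(end, start, units = "hours")
```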

Inclusion/exclusion arguments

The three `include_*` arguments each have different requirements:

`include_metadata`

Elements must be one of the 17 Qualtrics metadata variables, listed here in their default order: StartDate (startDate), EndDate (endDate), Status (status), IPAddress (ipAddress), Progress (progress), Duration (in seconds) (duration), Finished (finished), RecordedDate (recordedDate), ResponseId (_recordId), RecipientLastName (recipientLastName), RecipientFirstName (recipientFirstName), RecipientEmail (recipientEmail), ExternalReference (externalDataReference), LocationLatitude (locationLatitude), LocationLongitude (locationLongitude), DistributionChannel (distributionChannel), UserLanguage (userLanguage).

Names in parentheses are those returned by the API endpoint when import_id is set to TRUE. The argument include_metadata can accept either format regardless of import_id setting, and names are not case-sensitive. Duplicate elements passed to include_metadata will be silently dropped, with the de-duplicated variable located in the first position.

`include_questions`

Qualtrics uniquely identifies each question with an internal ID that takes the form "QID" followed by a number, e.g. QID5. When using include_questions, these internal IDs must be used rather than user-customizable variable names (which need not be unique in Qualtrics). If needed, a column map linking customizable names to QIDs can be quickly obtained by calling:

# Download a single response; the column map is attached as an attribute
my_survey <- fetch_survey(
  surveyID = "<YOUR-SURVEY-ID>",
  limit = 1,
  add_column_map = TRUE
)
extract_colmap(my_survey)

Note that while there is one QID for each "question" in the Qualtrics sense, each QID may still map to multiple columns in the returned data frame. If, for example, a "question" with ID QID5 is a multiple-choice item with a text box added to the third choice, the returned data frame may have two related columns: "QID5" for the multiple choice selection, and "QID5_3_TEXT" for the text box (or, more typically, their custom names). Setting include_questions = "QID5" will always return both columns. Similarly, "matrix" style multiple-choice questions will have a column for each separate row of the matrix. Also, when include_display_order = TRUE, display ordering variables for any randomization will be included. Currently, separating these sub-questions via the API does not appear possible (e.g., include_questions = "QID5_3_TEXT" will result in an API error).
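
Since sub-question IDs such as "QID5_3_TEXT" are rejected, it can help to reduce import IDs to their parent QIDs before building include_questions. The sketch below does this on a hand-built mock column map; the column names `qname` and `ImportId` are assumptions about what `extract_colmap()` returns and should be checked against your own output.

```r
# Mock column map for illustration only; real maps come from extract_colmap().
colmap <- data.frame(
  qname = c("StartDate", "favorite_color", "favorite_color_3_TEXT"),
  ImportId = c("startDate", "QID5", "QID5_3_TEXT"),
  stringsAsFactors = FALSE
)

# Keep question rows (import IDs starting with "QID" plus digits),
# then strip any sub-question suffix to recover the parent QID.
is_question <- grepl("^QID[0-9]+", colmap$ImportId)
qids <- unique(sub("^(QID[0-9]+).*$", "\\1", colmap$ImportId[is_question]))
qids
```

The resulting vector is in the form include_questions expects.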

`include_embedded`

This argument accepts the user-specified names of any embedded data variables in the survey being accessed.
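
The three inclusion arguments can be combined in one request. The following is a sketch under stated assumptions: the wrapper `fetch_minimal()` is hypothetical, "QID5" is a placeholder question ID, and calling the wrapper requires qualtRics and registered API credentials.

```r
# Hypothetical wrapper (not part of qualtRics).
fetch_minimal <- function(survey_id) {
  qualtRics::fetch_survey(
    surveyID = survey_id,
    # Either naming format is accepted, and names are not case-sensitive
    include_metadata = c("ResponseId", "recordedDate"),
    # Internal question IDs, not custom variable names
    include_questions = c("QID5"),
    # NA excludes every embedded data variable
    include_embedded = NA
  )
}
```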

Examples

## Not run: 
# Register your Qualtrics credentials if you haven't already
qualtrics_api_credentials(
  api_key = "<YOUR-API-KEY>",
  base_url = "<YOUR-BASE-URL>"
)

# Retrieve a list of surveys
surveys <- all_surveys()

# Retrieve a single survey
my_survey <- fetch_survey(surveyID = surveys$id[6])

my_survey <- fetch_survey(
  surveyID = surveys$id[6],
  start_date = "2018-01-01",
  end_date = "2018-01-31",
  limit = 100,
  label = TRUE,
  unanswer_recode = 999,
  verbose = TRUE,
  # Manually override EndDate to be a character vector
  col_types = readr::cols(EndDate = readr::col_character())
)


## End(Not run)

[Package qualtRics version 3.2.0 Index]
