dbGetFieldsIntoDf {ctrdata}    R Documentation

Create data frame of specified fields from database collection


Fields in the collection are retrieved into a data frame (or tibble). Note that fields within the record of a trial can be hierarchical and structured, that is, nested. Names of fields can be found with dbFindFields. The function uses the field names to appropriately type the values that it returns, harmonising original values (e.g. "Information not present in EudraCT" becomes NA, "Yes" becomes TRUE, "false" becomes FALSE, date strings become class Date, number strings become numbers). The function attempts to simplify the structure of some nested data and may concatenate multiple strings in a field using " / " (see below); for complex nested data, use function dfTrials2Long followed by dfName2Value to extract the desired nested variable(s).
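The kind of harmonisation described above can be illustrated with a minimal base R sketch. The helper harmonise_logical is hypothetical and is not part of ctrdata; it only mimics the mapping of strings to TRUE, FALSE and NA on a plain character vector. Case-insensitive matching and the inclusion of "no" and "true" are assumptions, as is the ISO format of the date string.

```r
# Hypothetical helper, NOT part of ctrdata: maps yes/true to TRUE,
# no/false to FALSE, and anything else to NA (case-insensitive, assumed).
harmonise_logical <- function(x) {
  out <- rep(NA, length(x))  # logical NA by default
  out[tolower(x) %in% c("yes", "true")] <- TRUE
  out[tolower(x) %in% c("no", "false")] <- FALSE
  out
}

raw <- c("Yes", "false", "Information not present in EudraCT")
harmonised <- harmonise_logical(raw)
print(harmonised)

# Date strings and number strings can be converted with base R;
# the ISO date format used here is an assumption for illustration.
a_date <- as.Date("2020-03-15")
a_number <- as.numeric("42")
class(a_date)
```

How the package itself performs the typing is not shown here; the sketch only makes the before and after of the conversion concrete.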


dbGetFieldsIntoDf(fields = "", con, verbose = FALSE, stopifnodata = TRUE)



fields
Vector of one or more strings, with names of sought fields. See function dbFindFields for how to find names of fields. "item.subitem" notation is supported.


con
A connection object; see section 'Databases' in ctrdata-package.


verbose
Prints additional information if set to TRUE (default FALSE).


stopifnodata
Stops with an error (default TRUE) or with a warning (FALSE) if the sought field is empty in all, or not available in any, of the records in the database collection.
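Because the default is to stop with an error when no data are found, a script that loops over several fields may want to catch that error. The following is a minimal sketch using base R tryCatch; the helper get_or_null is hypothetical and not part of ctrdata, and a stand-in function is used so that the block runs without a database.

```r
# Hypothetical helper, not part of ctrdata: call a function and
# return NULL (with a message) instead of stopping on an error.
get_or_null <- function(getter, ...) {
  tryCatch(
    getter(...),
    error = function(e) {
      message("No data retrieved: ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in for a retrieval that fails; with ctrdata installed and a
# connection object dbc, the call might instead look like:
# get_or_null(ctrdata::dbGetFieldsIntoDf, fields = "keyword", con = dbc)
failing <- function(...) stop("field not available")
res <- get_or_null(failing, fields = "keyword")
is.null(res)
```

Alternatively, setting stopifnodata = FALSE turns the error into a warning, as described above.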


A data frame (or tibble, if dplyr is loaded) with columns corresponding to the sought fields. A column for the record '_id' will always be included. Each column can be either a simple data type (numeric, character, date) or a list. For complicated lists, use function dfTrials2Long followed by function dfName2Value to extract values for nested variables. The number of rows of the returned data frame is equal to or less than the number of trial records in the database collection.
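Since a returned column can be either a simple type or a list, it can help to check which columns are lists before analysing them. The sketch below uses a toy data frame with made-up values standing in for a result; only base R functions are used.

```r
# Toy data frame standing in for a returned result; values are made up
df <- data.frame(
  `_id` = c("trial-1", "trial-2"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
df$n_sites <- c(3, 12)                                 # simple numeric column
df$nested <- list(list(a = 1), list(a = 2, b = "x"))   # list column

# Identify list columns, which may need further processing
is_list_col <- vapply(df, is.list, logical(1))
list_cols <- names(df)[is_list_col]
print(list_cols)
```

Columns identified this way are the candidates for further processing with dfTrials2Long and dfName2Value.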


dbc <- nodbi::src_sqlite(
   dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
   collection = "my_trials")

# get fields that are nested within another field
# and can have multiple values within the nested field
dbGetFieldsIntoDf(
  fields = "b1_sponsor.b31_and_b32_status_of_the_sponsor",
  con = dbc)

# fields that are lists of string values are
# returned by concatenating values with a slash
dbGetFieldsIntoDf(
  fields = "keyword",
  con = dbc)
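
Values concatenated with " / " can be split back into separate strings with base R. The sketch below uses made-up values in the shape described, not actual output from the demo database. It assumes that no individual value itself contains " / ".

```r
# Made-up values in the shape described: strings joined with " / "
keyword <- c("oncology / paediatric", "vaccine", NA)

# fixed = TRUE treats the separator literally rather than as a regex
keyword_list <- strsplit(keyword, split = " / ", fixed = TRUE)
n_values <- lengths(keyword_list)
print(keyword_list[[1]])
```

A missing value stays a single NA element in the resulting list.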

[Package ctrdata version 1.11.1]