prepare_data {daiquiri}    R Documentation

Prepare source data

Description

Validate a data frame against a field_types() specification, and prepare it for aggregation.

Usage

prepare_data(
  df,
  field_types,
  override_column_names = FALSE,
  na = c("", "NA", "NULL"),
  dataset_description = NULL,
  show_progress = TRUE
)

Arguments

df

A data frame

field_types

A field_types() object specifying the names and types of the fields (columns) in the supplied df. See also field_types_available.

override_column_names

If FALSE, column names in the supplied df must match the names specified in field_types exactly. If TRUE, column names in the supplied df will be replaced with the names specified in field_types, so the specification must list the columns in the same order as they appear in df. Default = FALSE.

na

Vector of strings that should be interpreted as missing values. Default = c("", "NA", "NULL"). Additional column-specific values can be specified in the field_types() object.

dataset_description

Short description of the dataset being checked. This will appear on the report. If NULL (the default), the name of the data frame object will be used.

show_progress

If TRUE, print progress to the console. Default = TRUE.
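The interaction between override_column_names and na can be illustrated with a minimal sketch. The data frame, its column names (V1, V2), the field names (event_date, measurement), and the "unknown" marker are all hypothetical and chosen only for illustration; the sketch assumes the daiquiri package is installed.

```r
library(daiquiri)

# Hypothetical data: generic column names, with "unknown" used as a
# missing-value marker in the second column
df <- data.frame(
  V1 = c("2022-01-01", "2022-01-02", "2022-01-03"),
  V2 = c("1.5", "unknown", "3"),
  stringsAsFactors = FALSE
)

source_data <- prepare_data(
  df,
  # Fields are listed in the same order as the columns of df, because
  # override_column_names = TRUE replaces names by position
  field_types = field_types(
    event_date = ft_timepoint(includes_time = FALSE),
    measurement = ft_numeric()
  ),
  override_column_names = TRUE,
  # Dataset-level missing-value strings, extending the documented default
  na = c("", "NA", "NULL", "unknown"),
  dataset_description = "Hypothetical example",
  show_progress = FALSE
)

source_data
```

With override_column_names = FALSE, the same call would be expected to fail validation, since V1 and V2 do not match the names in the specification.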

Value

A daiquiri_source_data object
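The returned object is intended as input to aggregate_data(), listed under See Also. The following is a minimal sketch of that hand-off, assuming the daiquiri package is installed; the data frame and its field names are hypothetical.

```r
library(daiquiri)

# Hypothetical data: one timepoint column and one numeric column
df <- data.frame(
  event_date = c("2022-01-01", "2022-01-01", "2022-01-02", "2022-01-03"),
  measurement = c("1.5", "2", "", "3"),
  stringsAsFactors = FALSE
)

# Validate the data frame against the specification
source_data <- prepare_data(
  df,
  field_types = field_types(
    event_date = ft_timepoint(includes_time = FALSE),
    measurement = ft_numeric()
  ),
  show_progress = FALSE
)

# Pass the prepared object on for aggregation by day
aggregated_data <- aggregate_data(
  source_data,
  aggregation_timeunit = "day",
  show_progress = FALSE
)

aggregated_data
```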

See Also

field_types(), field_types_available(), aggregate_data(), report_data(), daiquiri_report()

Examples

# load example data into a data.frame
raw_data <- read_data(
  system.file("extdata", "example_prescriptions.csv", package = "daiquiri"),
  delim = ",",
  col_names = TRUE
)

# validate and prepare the data for aggregation
source_data <- prepare_data(
  raw_data,
  field_types = field_types(
    PrescriptionID = ft_uniqueidentifier(),
    PrescriptionDate = ft_timepoint(),
    AdmissionDate = ft_datetime(includes_time = FALSE),
    Drug = ft_freetext(),
    Dose = ft_numeric(),
    DoseUnit = ft_categorical(),
    PatientID = ft_ignore(),
    Location = ft_categorical(aggregate_by_each_category = TRUE)
  ),
  override_column_names = FALSE,
  na = c("", "NULL"),
  dataset_description = "Example data provided with package"
)

source_data

[Package daiquiri version 1.1.1 Index]