derive_vars_merged {admiral}    R Documentation

Add New Variable(s) to the Input Dataset Based on Variables from Another Dataset

Description

Add new variable(s) to the input dataset based on variables from another dataset. The observations to merge can be selected by a condition (filter_add argument) and/or by selecting the first or last observation for each by group (order and mode arguments).

Usage

derive_vars_merged(
  dataset,
  dataset_add,
  by_vars,
  order = NULL,
  new_vars = NULL,
  mode = NULL,
  filter_add = NULL,
  match_flag = NULL,
  check_type = "warning",
  duplicate_msg = NULL
)

Arguments

dataset

Input dataset

The variables specified by the by_vars parameter are expected.

dataset_add

Additional dataset

The variables specified by the by_vars, new_vars, and order parameters are expected.

by_vars

Grouping variables

The input dataset and the selected observations from the additional dataset are merged by the specified by variables. The by variables must be a unique key of the selected observations.

Permitted Values: list of variables created by vars()

order

Sort order

If the parameter is set to a non-null value, for each by group the first or last observation from the additional dataset is selected with respect to the specified order.

Default: NULL

Permitted Values: list of variables or desc(<variable>) function calls created by vars(), e.g., vars(ADT, desc(AVAL)) or NULL

new_vars

Variables to add

The specified variables from the additional dataset are added to the output dataset. Variables can be renamed by naming the element, i.e., new_vars = vars(<new name> = <old name>).

For example, new_vars = vars(var1, var2) adds variables var1 and var2 from dataset_add to the input dataset.

Similarly, new_vars = vars(var1, new_var2 = old_var2) takes var1 and old_var2 from dataset_add and adds them to the input dataset, renaming old_var2 to new_var2.

If the parameter is not specified or set to NULL, all variables from the additional dataset (dataset_add) are added.

Default: NULL

Permitted Values: list of variables created by vars()

mode

Selection mode

Determines if the first or last observation is selected. If the order parameter is specified, mode must be non-null.

If the order parameter is not specified, the mode parameter is ignored.

Default: NULL

Permitted Values: "first", "last", NULL
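
The effect of order and mode can be illustrated with plain dplyr. This is only a sketch of the selection semantics described above, not admiral's implementation; the toy dataset ex and its values are made up for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical exposure records: subject "01" has two records, "02" has one
ex <- tibble(
  USUBJID = c("01", "01", "02"),
  EXSTDTC = c("2021-03-01", "2021-01-01", "2021-02-01")
)

# Sort within each by group, corresponding to order = vars(EXSTDTC)
ordered <- ex %>%
  group_by(USUBJID) %>%
  arrange(EXSTDTC, .by_group = TRUE)

# Corresponds to mode = "first": earliest record per subject
first_obs <- ordered %>% slice(1) %>% ungroup()

# Corresponds to mode = "last": latest record per subject
last_obs <- ordered %>% slice(n()) %>% ungroup()
```

For subject "01" the two modes pick different records; for subject "02", which has a single record, both modes pick the same one.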

filter_add

Filter for additional dataset (dataset_add)

Only observations fulfilling the specified condition are taken into account for merging. If the parameter is not specified, all observations are considered.

Default: NULL

Permitted Values: a condition

match_flag

Match flag

If the parameter is specified (e.g., match_flag = FLAG), the specified variable (e.g., FLAG) is added to the input dataset. This variable will be TRUE for all selected records from dataset_add which are merged into the input dataset, and NA otherwise.

Default: NULL

Permitted Values: Variable name
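
The TRUE/NA behaviour of the match flag can be pictured with a small dplyr sketch. It mimics the documented result rather than reproducing admiral's code; the datasets adsl and ex and the flag name EXFL are hypothetical.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical input dataset and additional dataset
adsl <- tibble(USUBJID = c("01", "02", "03"))
ex <- tibble(USUBJID = c("01", "03"), EXDOSE = c(50, 100))

# Flag every record of the additional dataset before joining, so the flag
# is TRUE where a record was merged and NA where no record matched
result <- left_join(
  adsl,
  mutate(ex, EXFL = TRUE),
  by = "USUBJID"
)
```

Subject "02" has no record in ex, so both EXDOSE and EXFL are NA for that subject.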

check_type

Check uniqueness?

If "warning" or "error" is specified, a warning or an error, respectively, is issued with the message given by duplicate_msg if the observations of the (restricted) additional dataset are not unique with respect to the by variables and the order.

Default: "warning"

Permitted Values: "none", "warning", "error"
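
To see which records would trip the uniqueness check, one can count records per combination of by variables and order variables. The following is a rough dplyr sketch under the assumption that USUBJID is the by variable and VSDTC the order variable; it is not the check admiral performs internally, and the data are invented.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical vital signs data: subject "01" has two records on the same date
vs <- tibble(
  USUBJID  = c("01", "01", "02"),
  VSDTC    = c("2021-01-01", "2021-01-01", "2021-01-15"),
  VSSTRESN = c(70, 71, 80)
)

# Combinations of by variable and order variable that occur more than once
duplicates <- vs %>%
  count(USUBJID, VSDTC, name = "n_records") %>%
  filter(n_records > 1)
```

A non-empty duplicates dataset means that the first or last observation is not well defined for those by groups, which is the situation check_type is meant to report.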

duplicate_msg

Message of unique check

If the uniqueness check fails, the specified message is displayed.

Default:

paste("Dataset `dataset_add` contains duplicate records with respect to",
      enumerate(vars2chr(by_vars)))

Details

  1. The records from the additional dataset (dataset_add) are restricted to those matching the filter_add condition.

  2. If order is specified, for each by group the first or last observation (depending on mode) is selected.

  3. The variables specified for new_vars are renamed (if requested) and merged to the input dataset using left_join(). I.e., the output dataset contains all observations from the input dataset. For observations without a matching observation in the additional dataset the new variables are set to NA. Observations in the additional dataset which have no matching observation in the input dataset are ignored.
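
The three steps above can be sketched with plain dplyr verbs. This is a minimal illustration of the described logic on made-up data, not the function's actual implementation; it corresponds roughly to a call with filter_add = VSTESTCD == "WEIGHT", order = vars(VSDTC), mode = "last", and new_vars = vars(LASTWGT = VSSTRESN).

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data, not taken from admiral.test
adsl <- tibble(USUBJID = c("01", "02", "03"))
vs <- tibble(
  USUBJID  = c("01", "01", "02", "04"),
  VSTESTCD = c("WEIGHT", "WEIGHT", "HEIGHT", "WEIGHT"),
  VSDTC    = c("2021-01-01", "2021-02-01", "2021-01-15", "2021-03-01"),
  VSSTRESN = c(70, 72, 180, 65)
)

selected <- vs %>%
  # Step 1: restrict to records matching the filter condition
  filter(VSTESTCD == "WEIGHT") %>%
  # Step 2: keep the last record per by group with respect to the order
  group_by(USUBJID) %>%
  arrange(VSDTC, .by_group = TRUE) %>%
  slice(n()) %>%
  ungroup() %>%
  # Step 3 (first half): keep the by variable and the renamed new variable
  select(USUBJID, LASTWGT = VSSTRESN)

# Step 3 (second half): the left join keeps every input record
result <- left_join(adsl, selected, by = "USUBJID")
```

In the result, subject "01" receives the later weight, subjects "02" and "03" get NA because no weight record matches, and subject "04" is dropped because it does not occur in the input dataset.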

Value

The output dataset contains all observations and variables of the input dataset and additionally the variables specified for new_vars from the additional dataset (dataset_add).

Author(s)

Stefan Bundfuss

See Also

General Derivation Functions for all ADaMs that return variables appended to the dataset: derive_var_confirmation_flag(), derive_var_extreme_flag(), derive_var_last_dose_amt(), derive_var_last_dose_date(), derive_var_last_dose_grp(), derive_var_merged_cat(), derive_var_merged_character(), derive_var_merged_exist_flag(), derive_var_obs_number(), derive_var_worst_flag(), derive_vars_last_dose(), derive_vars_merged_lookup(), derive_vars_transposed(), get_summary_records()

Examples

library(admiral.test)
library(dplyr, warn.conflicts = FALSE)
data("admiral_vs")
data("admiral_dm")

# Merging all dm variables to vs
derive_vars_merged(
  admiral_vs,
  dataset_add = select(admiral_dm, -DOMAIN),
  by_vars = vars(STUDYID, USUBJID)
) %>%
  select(STUDYID, USUBJID, VSTESTCD, VISIT, VSTPT, VSSTRESN, AGE, AGEU)

# Merge last weight to adsl
data("admiral_adsl")
derive_vars_merged(
  admiral_adsl,
  dataset_add = admiral_vs,
  by_vars = vars(STUDYID, USUBJID),
  order = vars(VSDTC),
  mode = "last",
  new_vars = vars(LASTWGT = VSSTRESN, LASTWGTU = VSSTRESU),
  filter_add = VSTESTCD == "WEIGHT",
  match_flag = vsdatafl
) %>%
  select(STUDYID, USUBJID, AGE, AGEU, LASTWGT, LASTWGTU, vsdatafl)

# Derive treatment start datetime (TRTSDTM)
data("admiral_ex")

## Impute exposure start date to first date/time
ex_ext <- derive_vars_dtm(
  admiral_ex,
  dtc = EXSTDTC,
  new_vars_prefix = "EXST",
  highest_imputation = "M"
)

## Add first exposure datetime and imputation flags to adsl
derive_vars_merged(
  select(admiral_dm, STUDYID, USUBJID),
  dataset_add = ex_ext,
  by_vars = vars(STUDYID, USUBJID),
  new_vars = vars(TRTSDTM = EXSTDTM, TRTSDTF = EXSTDTF, TRTSTMF = EXSTTMF),
  order = vars(EXSTDTM),
  mode = "first"
)

# Derive treatment start datetime (TRTSDTM), considering only exposure records
# with a non-missing imputed start datetime (reuses ex_ext derived above)

## Add first exposure datetime and imputation flags to adsl
derive_vars_merged(
  select(admiral_dm, STUDYID, USUBJID),
  dataset_add = ex_ext,
  filter_add = !is.na(EXSTDTM),
  by_vars = vars(STUDYID, USUBJID),
  new_vars = vars(TRTSDTM = EXSTDTM, TRTSDTF = EXSTDTF, TRTSTMF = EXSTTMF),
  order = vars(EXSTDTM),
  mode = "first"
)

# Derive treatment end datetime (TRTEDTM)
## Impute exposure end datetime to last time, no date imputation
ex_ext <- derive_vars_dtm(
  admiral_ex,
  dtc = EXENDTC,
  new_vars_prefix = "EXEN",
  time_imputation = "last"
)

## Add last exposure datetime and imputation flag to adsl
derive_vars_merged(
  select(admiral_dm, STUDYID, USUBJID),
  dataset_add = ex_ext,
  filter_add = !is.na(EXENDTM),
  by_vars = vars(STUDYID, USUBJID),
  new_vars = vars(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
  order = vars(EXENDTM),
  mode = "last"
)

[Package admiral version 0.8.4 Index]