derive_var_merged_summary {admiral}

R Documentation

Merge Summary Variables

Description

Merge a summary variable from a dataset to the input dataset.

Usage

derive_var_merged_summary(
  dataset,
  dataset_add,
  by_vars,
  new_vars = NULL,
  new_var,
  filter_add = NULL,
  missing_values = NULL,
  analysis_var,
  summary_fun
)

Arguments

`dataset`	Input dataset. The variables specified by the `by_vars` argument are expected to be in the dataset.
`dataset_add`	Additional dataset. The variables specified by the `by_vars` argument and the variables used on the right-hand sides of the `new_vars` expressions are expected to be in the dataset.
`by_vars`	Grouping variables. The expressions on the right-hand sides of `new_vars` are evaluated within each group defined by the specified variables. The resulting values are then merged to the input dataset (`dataset`) by the same variables. Permitted Values: list of variables created by `exprs()`, e.g., `exprs(USUBJID, VISIT)`
`new_vars`	New variables to add. The specified variables are added to the input dataset. A named list of expressions is expected: the LHS refers to a variable, and the RHS refers to the values to set to the variable. The RHS can be a string, a symbol, a numeric value, an expression, or `NA`. If summary functions are used, the values are summarized by the variables specified for `by_vars`. For example: `new_vars = exprs(DOSESUM = sum(AVAL), DOSEMEAN = mean(AVAL))`
`new_var`	Variable to add. Please use `new_vars` instead. The specified variable is added to the input dataset (`dataset`) and set to the summarized values.
`filter_add`	Filter for additional dataset (`dataset_add`). Only observations fulfilling the specified condition are taken into account for summarizing. If the argument is not specified, all observations are considered. Permitted Values: a condition
`missing_values`	Values for non-matching observations. For observations of the input dataset (`dataset`) which do not have a matching observation in the additional dataset (`dataset_add`), the values of the specified variables are set to the specified values. Only variables specified for `new_vars` can be specified for `missing_values`. Permitted Values: named list of expressions, e.g., `exprs(BASEC = "MISSING", BASE = -1)`
`analysis_var`	Analysis variable. Please use `new_vars` instead. The values of the specified variable are summarized by the function specified for `summary_fun`.
`summary_fun`	Summary function. Please use `new_vars` instead. The specified function takes `analysis_var` as input and performs the calculation. This can include built-in functions as well as user-defined functions, for example `mean` or `function(x) mean(x, na.rm = TRUE)`.
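
The argument descriptions above suggest that `new_var`, `analysis_var`, and `summary_fun` can be folded into a single named expression in `new_vars`. The following is a minimal sketch of that mapping. The data and the variable name `MEANVAL` are made up for illustration, and the old-style arguments appear only in a comment because their exact behavior in this version is not stated here.

```r
library(admiral)
library(tibble)

# Hypothetical input data for illustration only
adbds <- tribble(
  ~USUBJID, ~AVAL,
  "1",         10,
  "1",         NA,
  "2",         21,
  "2",         23
)

# Old-style specification (shown as a comment only):
#   new_var = MEANVAL,
#   analysis_var = AVAL,
#   summary_fun = function(x) mean(x, na.rm = TRUE)
#
# The same summary written as one named expression in new_vars:
result <- derive_var_merged_summary(
  adbds,
  dataset_add = adbds,
  by_vars = exprs(USUBJID),
  new_vars = exprs(MEANVAL = mean(AVAL, na.rm = TRUE))
)
```

The name on the left of `=` takes the place of `new_var`, and the call on the right combines `summary_fun` and `analysis_var`.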

Details

The records from the additional dataset (`dataset_add`) are restricted to those matching the `filter_add` condition.
The new variables (`new_vars`) are created for each by group (`by_vars`) in the additional dataset (`dataset_add`) by calling `summarize()`, i.e., all observations of a by group are summarized to a single observation.
The new variables are merged to the input dataset by the `by_vars` variables. For observations without a matching observation in the additional dataset, the new variables are set to the values specified for `missing_values`, or to `NA` if none are specified. Observations in the additional dataset which have no matching observation in the input dataset are ignored.
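
The steps above can be sketched with plain dplyr. This is only an illustration of the described behavior, assuming the merge acts like a left join. It is not the package's actual implementation, and the data (including subject "4", which exists only in the additional dataset) is made up for the example.

```r
library(dplyr)
library(tibble)

# Hypothetical input dataset: one row per subject
adsl <- tribble(
  ~USUBJID,
  "1",
  "2",
  "3"
)

# Hypothetical additional dataset; subject "4" has no match in adsl
adtr <- tribble(
  ~USUBJID, ~AVISIT,    ~LESIONID,
  "1",      "BASELINE", "INV-T1",
  "1",      "BASELINE", "INV-T2",
  "1",      "WEEK 1",   "INV-T1",
  "2",      "BASELINE", "INV-T1",
  "4",      "BASELINE", "INV-T1"
)

summary_by_subject <- adtr %>%
  # Step 1: keep only records matching the filter condition
  filter(AVISIT == "BASELINE") %>%
  # Step 2: summarize each by group to a single observation
  group_by(USUBJID) %>%
  summarize(LESIONSBL = paste(LESIONID, collapse = ", "), .groups = "drop")

# Step 3: merge onto the input dataset; unmatched input rows get NA,
# and rows present only in the additional dataset are dropped
result <- left_join(adsl, summary_by_subject, by = "USUBJID")
```

In this sketch subject "3" ends up with `NA` for `LESIONSBL`, and subject "4" does not appear in the result.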

Value

The output dataset contains all observations and variables of the input dataset and additionally the variables specified for `new_vars`.

Examples

library(admiral)
library(tibble)

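# Add variables for the mean and maximum of AVAL within each subject and visit
# Note: for subject "1" at WEEK 2 all AVAL values are NA, so with na.rm = TRUE
# mean() yields NaN and max() yields -Inf (with a warning)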
adbds <- tribble(
  ~USUBJID,  ~AVISIT,  ~ASEQ, ~AVAL,
  "1",      "WEEK 1",      1,    10,
  "1",      "WEEK 1",      2,    NA,
  "1",      "WEEK 2",      3,    NA,
  "1",      "WEEK 3",      4,    42,
  "1",      "WEEK 4",      5,    12,
  "1",      "WEEK 4",      6,    12,
  "1",      "WEEK 4",      7,    15,
  "2",      "WEEK 1",      1,    21,
  "2",      "WEEK 4",      2,    22
)

derive_var_merged_summary(
  adbds,
  dataset_add = adbds,
  by_vars = exprs(USUBJID, AVISIT),
  new_vars = exprs(
    MEANVIS = mean(AVAL, na.rm = TRUE),
    MAXVIS = max(AVAL, na.rm = TRUE)
  )
)

# Add a variable listing the lesion ids at baseline
adsl <- tribble(
  ~USUBJID,
  "1",
  "2",
  "3"
)

adtr <- tribble(
  ~USUBJID,     ~AVISIT, ~LESIONID,
  "1",       "BASELINE",  "INV-T1",
  "1",       "BASELINE",  "INV-T2",
  "1",       "BASELINE",  "INV-T3",
  "1",       "BASELINE",  "INV-T4",
  "1",         "WEEK 1",  "INV-T1",
  "1",         "WEEK 1",  "INV-T2",
  "1",         "WEEK 1",  "INV-T4",
  "2",       "BASELINE",  "INV-T1",
  "2",       "BASELINE",  "INV-T2",
  "2",       "BASELINE",  "INV-T3",
  "2",         "WEEK 1",  "INV-T1",
  "2",         "WEEK 1",  "INV-N1"
)

derive_var_merged_summary(
  adsl,
  dataset_add = adtr,
  by_vars = exprs(USUBJID),
  filter_add = AVISIT == "BASELINE",
  new_vars = exprs(LESIONSBL = paste(LESIONID, collapse = ", "))
)
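
None of the examples above use `missing_values`. The following sketch is based only on the argument description (a named list of expressions whose names also appear in `new_vars`) and shows how a subject without baseline records might receive a placeholder instead of `NA`. The placeholder text `"NONE"` and the shortened data are assumptions made for illustration.

```r
library(admiral)
library(tibble)

# Input dataset: subject "3" has no records in adtr
adsl <- tribble(
  ~USUBJID,
  "1",
  "2",
  "3"
)

# Shortened, hypothetical version of the lesion data
adtr <- tribble(
  ~USUBJID,     ~AVISIT, ~LESIONID,
  "1",       "BASELINE",  "INV-T1",
  "1",       "BASELINE",  "INV-T2",
  "1",         "WEEK 1",  "INV-T1",
  "2",       "BASELINE",  "INV-T1",
  "2",         "WEEK 1",  "INV-N1"
)

# Same derivation as above, but non-matching subjects get "NONE"
# (an assumed placeholder) rather than NA
result <- derive_var_merged_summary(
  adsl,
  dataset_add = adtr,
  by_vars = exprs(USUBJID),
  filter_add = AVISIT == "BASELINE",
  new_vars = exprs(LESIONSBL = paste(LESIONID, collapse = ", ")),
  missing_values = exprs(LESIONSBL = "NONE")
)
```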

[Package admiral version 1.1.1 Index]