
brm_data {brms.mmrm}

R Documentation

Create and preprocess an MMRM dataset.

Description

Create a dataset to analyze with an MMRM.

Usage

brm_data(
  data,
  outcome,
  baseline = NULL,
  group,
  subgroup = NULL,
  time,
  patient,
  covariates = character(0L),
  missing = NULL,
  reference_group,
  reference_subgroup = NULL,
  reference_time = NULL,
  role = NULL,
  level_baseline = NULL,
  level_control = NULL
)

Arguments

`data`	Data frame or tibble with longitudinal data.
`outcome`	Character of length 1, name of the continuous outcome variable. Example possibilities from clinical trial datasets include `"CHG"` and `"AVAL"`. The `outcome` column in the data should be a numeric vector.
`baseline`	Character of length 1, name of the baseline response variable (for example, `"BASE"` in many clinical trial datasets). Only relevant if the response variable is change from baseline. Supply `NULL` to ignore or omit.
`group`	Character of length 1, name of the treatment group variable. Example possibilities from clinical trial datasets include `"TRT01P"`, `"TREATMENT"`, `"TRT"`, and `"GROUP"`. The `group` column in the data should be a character vector or unordered factor.
`subgroup`	Character of length 1, optional name of a discrete subgroup variable. Set to `NULL` to omit the subgroup (default). If present, the `subgroup` column in the data should be a character vector or unordered factor.
`time`	Character of length 1, name of the discrete time variable. Example possibilities from clinical trial datasets include `"AVISIT"` and `"VISIT"`. For most analyses, please ensure the time column in the data is an ordered factor. You can easily turn the time variable into an ordered factor using `brm_data_chronologize()`, either before or immediately after `brm_data()` (but before any `brm_archetype_*()` functions). This makes the time points sort in chronological order, which is required for the correctness of informative prior archetypes and autoregressive / moving average correlation structures. Ordinarily, ordered factors automatically use polynomial contrasts from `contr.poly()`. This is undesirable for MMRMs, so if the time variable is an ordered factor, then `brm_data()` manually sets `contrasts(data[[time]])` to a set of treatment contrasts using `contr.treatment()`. If you prefer different contrasts, please manually set `contrasts(data[[time]])` to something else after calling `brm_data()`.
`patient`	Character of length 1, name of the patient ID variable. Example possibilities from clinical trial datasets include `"USUBJID"`, `"SUBJID"`, `"PATIENT"`, `"PATIENTID"`, `"SUBJECT"`, `"SUBJIDID"`, `"SBJID"`, `"STYSID1A"`, `"SBJ1N"`, and `"ID"`. The `patient` column in the data should be a factor or character vector.
`covariates`	Character vector of names of other covariates. All these covariates are assumed to be non-time-varying. For time-varying covariates, please manually expand the data to the full grid of patients and time points before you call `brm_data()`. See the "Preprocessing" section for details.
`missing`	Character of length 1, name of an optional variable in a simulated dataset to indicate which outcome values should be missing. Set to `NULL` to omit.
`reference_group`	Atomic value of length 1, level of the `group` column to indicate the control group. Example possibilities from clinical trial datasets include `"Placebo"`, `"PLACEBO"`, `"PBO"`, `"PLB"`, `"CONTROL"`, `"CTRL"`, `"REFERENCE"`, and `"REF"`. `reference_group` only applies to the post-processing that happens in functions like `brm_marginal_draws()` downstream of the model. It does not control the fixed effect mapping in the model matrix that `brms` derives from the formula from `brm_formula()`.
`reference_subgroup`	Atomic value of length 1, level of the `subgroup` column to use as a reference for pairwise differences when computing marginal means downstream of the model. It does not control the fixed effect mapping in the model matrix that `brms` derives from the formula from `brm_formula()`.
`reference_time`	Atomic value of length 1 or `NULL`, level of the `time` column to indicate the baseline time point. Leave as `NULL` if there is no baseline or baseline is not included in `data[[time]]`. If `reference_time` is not `NULL`, then `brm_marginal_draws()` will calculate change from baseline, and it will calculate treatment differences as differences between change-from-baseline values. If `reference_time` is `NULL`, then `brm_marginal_draws()` will not calculate change from baseline, and it will calculate treatment differences as differences between response values. Note: `reference_time` only applies to the post-processing that happens in functions like `brm_marginal_draws()` downstream of the model. It does not control the fixed effect mapping in the model matrix that `brms` derives from the formula from `brm_formula()`.
`role`	Deprecated as unnecessary on 2024-07-11 (version 1.0.1.9007). Use `reference_time` to supply a baseline time point value if it exists.
`level_baseline`	Deprecated on 2024-01-11 (version 0.2.0.9002). Use `reference_time` instead.
`level_control`	Deprecated on 2024-01-11 (version 0.2.0.9002). Use `reference_group` instead.
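
As a base-R illustration of the contrast behavior described for the `time` argument, the sketch below shows that an ordered factor picks up polynomial contrasts by default and how treatment contrasts can be assigned by hand. The factor `visit` and its levels are made up for illustration, and the sketch assumes the default `options("contrasts")`. It does not reproduce the internals of `brm_data()`.

```r
# Hypothetical visit labels, already in chronological order.
visit <- factor(
  c("week_2", "week_4", "week_8"),
  levels = c("week_2", "week_4", "week_8"),
  ordered = TRUE
)

# With default options, ordered factors use polynomial contrasts.
default_contrasts <- contrasts(visit)
colnames(default_contrasts)

# Replace them with treatment (dummy) contrasts,
# using the first level as the reference.
contrasts(visit) <- contr.treatment(levels(visit))
treatment_contrasts <- contrasts(visit)
colnames(treatment_contrasts)
```

The same assignment pattern, applied to `contrasts(data[[time]])` after calling `brm_data()`, is how a different contrast scheme could be swapped in.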

Value

A classed tibble with attributes which denote features of the data such as the treatment group and discrete time variables.

Preprocessing

The preprocessing steps in brm_data() are as follows:

Perform basic assertions to make sure the data and other arguments are properly formatted.
Convert the group and time columns to character vectors.
Sanitize the levels of the group and time columns using make.names(unique = FALSE, allow_ = TRUE) to ensure agreement between the data and the output of brms.
For each implicitly missing outcome observation, add an explicit row with the outcome variable equal to NA_real_. Missing values in the predictors are implicitly filled using zoo::na.locf() within each patient, which is not valid for time-varying covariates. If any covariates are time-varying, please manually perform this step before calling brm_data().
Arrange the rows of the data by group, then patient, then discrete time.
Select only the columns of the data relevant to an MMRM analysis.
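
The steps above can be approximated in base R to see what they do to a small dataset. This is only a sketch under stated assumptions: the data frame, its column names, and its values are invented, and the group lookup below stands in for the zoo::na.locf() fill, which is only equivalent for non-time-varying columns. It is not the implementation used by brm_data().

```r
# Hypothetical longitudinal data: patient p2 has no row for the second visit.
data <- data.frame(
  patient = c("p1", "p1", "p2"),
  time = c("week 2", "week 4", "week 2"),
  group = c("drug A", "drug A", "placebo"),
  response = c(1.2, 0.8, -0.3),
  stringsAsFactors = FALSE
)

# Convert group and time to character vectors and sanitize their levels.
data$group <- make.names(as.character(data$group), unique = FALSE, allow_ = TRUE)
data$time <- make.names(as.character(data$time), unique = FALSE, allow_ = TRUE)

# Expand to the full patient-by-time grid so that implicitly missing
# outcomes become explicit rows with response equal to NA.
grid <- expand.grid(
  patient = unique(data$patient),
  time = unique(data$time),
  stringsAsFactors = FALSE
)
full <- merge(
  grid,
  data[, c("patient", "time", "response")],
  by = c("patient", "time"),
  all.x = TRUE
)

# Fill the non-time-varying group column from each patient's observed rows.
groups <- unique(data[, c("patient", "group")])
full <- merge(full, groups, by = "patient")

# Arrange by group, then patient, then discrete time.
full <- full[order(full$group, full$patient, full$time), ]
rownames(full) <- NULL
full
```

A time-varying covariate would need its own values supplied for each added row, which is why the expansion has to be done manually in that case.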

Separation string

Post-processing in brm_marginal_draws() names each of the group-by-time marginal means with the delimiting character string from Sys.getenv("BRM_SEP", unset = "|"). Neither the column names nor element names of the group and time variables can contain this string. To set a custom string yourself, use Sys.setenv(BRM_SEP = "YOUR_CUSTOM_STRING").
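
A quick way to screen names and levels against the separator is sketched below using only base R. The labels and the custom separator "::" are arbitrary examples, not values the package requires.

```r
# Remember any existing setting so it can be restored afterwards.
old_sep <- Sys.getenv("BRM_SEP", unset = NA)

# Use a custom separator (the value "::" is an arbitrary example).
Sys.setenv(BRM_SEP = "::")
sep <- Sys.getenv("BRM_SEP", unset = "|")

# Hypothetical column names and levels to screen before modeling.
labels <- c("col_group", "col_time", "group_1", "group_2", "time_1", "time_2")

# fixed = TRUE treats the separator literally, which matters for the
# default "|" because it is a regular expression metacharacter.
has_sep <- grepl(sep, labels, fixed = TRUE)
any(has_sep)

# Restore the previous state of the environment variable.
if (is.na(old_sep)) {
  Sys.unsetenv("BRM_SEP")
} else {
  Sys.setenv(BRM_SEP = old_sep)
}
```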

Examples

set.seed(0)
data <- brm_simulate_simple()$data
colnames(data) <- paste0("col_", colnames(data))
data
brm_data(
  data = data,
  outcome = "col_response",
  group = "col_group",
  time = "col_time",
  patient = "col_patient",
  reference_group = "group_1",
  reference_time = "time_1"
)
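
To illustrate what supplying `reference_time` changes downstream, the base-R arithmetic below contrasts the two kinds of treatment difference described in the Arguments section. The marginal means are invented numbers for illustration only. They are not output from `brm_marginal_draws()`, and the matrix layout is an assumption made for this sketch.

```r
# Hypothetical marginal means of the response by group and time.
means <- matrix(
  c(10, 9, 8,
    11, 8, 6),
  nrow = 2,
  byrow = TRUE,
  dimnames = list(
    c("placebo", "drug"),
    c("baseline", "week_4", "week_8")
  )
)

# With a baseline time point: change from baseline within each group,
# then treatment differences between the change-from-baseline values.
change <- means[, c("week_4", "week_8")] - means[, "baseline"]
diff_change <- change["drug", ] - change["placebo", ]

# Without a baseline time point: treatment differences between
# the response values themselves.
diff_response <- means["drug", ] - means["placebo", ]

diff_change
diff_response
```

The two sets of differences diverge here because the hypothetical groups start from different baseline means.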

[Package brms.mmrm version 1.1.0 Index]