
COMMA_PVW {COMMA}

R Documentation

Predictive Value Weighting Estimation of the Binary Mediator Misclassification Model

Description

Estimate \beta, \gamma, and \theta parameters from the true mediator, observed mediator, and outcome mechanisms, respectively, in a binary mediator misclassification model using a predictive value weighting approach.
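For orientation, the three mechanisms can be sketched as logistic models. This is inferred from the argument descriptions on this page and is not taken from the package source: it assumes logit links, a single predictor X, reference category 2 for both M and M*, and a binary outcome Y. The subscript notation is illustrative and may differ from the package's own.

```latex
\begin{aligned}
\text{True mediator:}\quad
  & \operatorname{logit} P(M = 1 \mid X, C) = \beta_0 + \beta_X X + \beta_C C \\
\text{Observed mediator:}\quad
  & \operatorname{logit} P(M^* = 1 \mid M = m, Z) = \gamma_{1m0} + \gamma_{1mZ} Z,
    \qquad m \in \{1, 2\} \\
\text{Outcome:}\quad
  & \operatorname{logit} P(Y = 1 \mid X, C, M = m) = \theta_0 + \theta_X X + \theta_M m
    + \theta_C C + \theta_{XM} X m
\end{aligned}
```

The \theta_{XM} term would appear only when `interaction_indicator` is `TRUE`. The coefficient order in each line mirrors the order required for `beta_start`, `gamma_start`, and `theta_start`.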

Usage

COMMA_PVW(
  Mstar,
  outcome,
  outcome_distribution,
  interaction_indicator,
  x_matrix,
  z_matrix,
  c_matrix,
  beta_start,
  gamma_start,
  theta_start,
  tolerance = 1e-07,
  max_em_iterations = 1500,
  em_method = "squarem"
)

Arguments

`Mstar`	A numeric vector of indicator variables (1, 2) for the observed mediator `M*`. There should be no `NA` terms. The reference category is 2.
`outcome`	A vector containing the outcome variables of interest. There should be no `NA` terms.
`outcome_distribution`	A character string specifying the distribution of the outcome variable. Options are `"Bernoulli"`, `"Poisson"`, or `"Normal"`.
`interaction_indicator`	A logical value indicating whether an interaction between `x` and `m` should be included in the model for the outcome variable, `y`.
`x_matrix`	A numeric matrix of predictors in the true mediator and outcome mechanisms. `x_matrix` should not contain an intercept and no values should be `NA`.
`z_matrix`	A numeric matrix of covariates in the observation mechanism. `z_matrix` should not contain an intercept and no values should be `NA`.
`c_matrix`	A numeric matrix of covariates in the true mediator and outcome mechanisms. `c_matrix` should not contain an intercept and no values should be `NA`.
`beta_start`	A numeric vector or column matrix of starting values for the `\beta` parameters in the true mediator mechanism. The number of elements in `beta_start` should be equal to the number of columns of `x_matrix` and `c_matrix` plus 1. Starting values should be provided in the following order: intercept, slope coefficient for the `x_matrix` term, slope coefficient for first column of the `c_matrix`, ..., slope coefficient for the final column of the `c_matrix`.
`gamma_start`	A numeric vector or matrix of starting values for the `\gamma` parameters in the observation mechanism. In matrix form, the rows of `gamma_start` correspond to parameters for the `M* = 1` observed mediator, so the number of rows equals the number of columns of `z_matrix` plus 1, and the columns correspond to the true mediator categories `M \in \{1, 2\}`. A numeric vector for `gamma_start` is obtained by concatenating the gamma matrix, i.e. `gamma_start <- c(gamma_matrix)`. Starting values should be provided in the following order within each column: intercept, slope coefficient for first column of the `z_matrix`, ..., slope coefficient for the final column of the `z_matrix`.
`theta_start`	A numeric vector or column matrix of starting values for the `\theta` parameters in the outcome mechanism. The number of elements in `theta_start` should be equal to the number of columns of `x_matrix` and `c_matrix` plus 2 (if `interaction_indicator` is `FALSE`) or 3 (if `interaction_indicator` is `TRUE`). Starting values should be provided in the following order: intercept, slope coefficient for the `x_matrix` term, slope coefficient for the mediator `m` term, slope coefficient for first column of the `c_matrix`, ..., slope coefficient for the final column of the `c_matrix`, and, optionally, slope coefficient for `xm`.
`tolerance`	A numeric value specifying when to stop estimation, based on the difference of subsequent log-likelihood estimates. The default is `1e-7`.
`max_em_iterations`	A numeric value specifying the maximum number of EM algorithm iterations to run before estimation stops. The default is `1500`.
`em_method`	A character string specifying which EM algorithm will be applied. Options are `"em"`, `"squarem"`, or `"pem"`. The default and recommended option is `"squarem"`.
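
The start-value dimensions described in the arguments above can be checked before calling `COMMA_PVW`. The following is a minimal base R sketch, assuming one column in `x_matrix` and two columns each in `z_matrix` and `c_matrix`. The matrices are simulated placeholders, the starting values are arbitrary, and the helper names `n_x`, `n_z`, `n_c`, and `gamma_matrix` are illustrative rather than part of the package.

```r
# Placeholder design matrices (illustrative only, not package output)
set.seed(1)
n <- 10
x_matrix <- matrix(rnorm(n), ncol = 1)
z_matrix <- matrix(rnorm(n * 2), ncol = 2)
c_matrix <- matrix(rnorm(n * 2), ncol = 2)

n_x <- ncol(x_matrix)
n_z <- ncol(z_matrix)
n_c <- ncol(c_matrix)
interaction_indicator <- TRUE

# beta: intercept, x slope, then one slope per column of c_matrix
beta_start <- matrix(rep(1, n_x + n_c + 1), ncol = 1)

# gamma: rows are the intercept followed by the z slopes;
# columns are the true mediator categories M = 1 and M = 2
gamma_matrix <- matrix(c(1, 0, 0,    # column for M = 1
                         -1, 0, 0),  # column for M = 2
                       nrow = n_z + 1, ncol = 2)

# c() flattens column by column: all M = 1 terms, then all M = 2 terms
gamma_start <- c(gamma_matrix)

# theta: intercept, x slope, m slope, c slopes, plus xm slope if requested
n_theta <- n_x + n_c + 2 + as.integer(interaction_indicator)
theta_start <- matrix(rep(1, n_theta), ncol = 1)

length(beta_start)
length(gamma_start)
length(theta_start)
```

Because R stores matrices in column-major order, the vector form of `gamma_start` lists every coefficient for `M = 1` before any coefficient for `M = 2`.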

Details

Note that this method can only be used for binary outcome models, i.e. with `outcome_distribution = "Bernoulli"`.
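
The input requirements stated on this page (no `NA` values, `Mstar` coded 1 or 2, a binary outcome) could be checked up front. The sketch below is a hypothetical helper written in base R; `check_pvw_inputs` is not a COMMA function, and it assumes only that a binary outcome takes at most two distinct values, without assuming a particular coding.

```r
# Hypothetical pre-flight check; not part of the COMMA package
check_pvw_inputs <- function(Mstar, outcome, x_matrix, z_matrix, c_matrix) {
  inputs <- list(Mstar = Mstar, outcome = outcome, x_matrix = x_matrix,
                 z_matrix = z_matrix, c_matrix = c_matrix)

  # No input may contain NA values
  has_na <- vapply(inputs, anyNA, logical(1))
  if (any(has_na)) {
    stop("NA values found in: ", paste(names(inputs)[has_na], collapse = ", "))
  }

  # The observed mediator must be coded 1 or 2 (reference category 2)
  if (!all(Mstar %in% c(1, 2))) {
    stop("Mstar must be coded 1 or 2.")
  }

  # A binary outcome has at most two distinct values
  if (length(unique(outcome)) > 2) {
    stop("outcome must be binary for the PVW method.")
  }

  # Every input must describe the same number of observations
  n_rows <- vapply(inputs, NROW, integer(1))
  if (length(unique(n_rows)) != 1) {
    stop("Inputs have differing numbers of observations.")
  }

  invisible(TRUE)
}

# Usage with small placeholder inputs
ok <- check_pvw_inputs(
  Mstar = c(1, 2, 2, 1),
  outcome = c(0, 1, 1, 0),
  x_matrix = matrix(c(0.1, -0.3, 1.2, 0.4), ncol = 1),
  z_matrix = matrix(c(0.5, 0.2, 0.9, 0.7), ncol = 1),
  c_matrix = matrix(c(1.1, 0.3, 0.6, 0.8), ncol = 1)
)
```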

Value

COMMA_PVW returns a data frame containing four columns. The first column, Parameter, identifies the parameter reported in each row. The second column, Estimates, contains the parameter estimates. The third column, Convergence, reports whether or not the algorithm converged for a given parameter estimate. The final column, Method, reports that the estimates are obtained from the "PVW" procedure.
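
As a rough sketch of how a result with this structure might be inspected, the following builds a stand-in data frame with the four documented columns. The parameter labels, the `NA` estimates, and the logical type of `Convergence` are placeholders assumed for illustration; the labels and types actually produced by `COMMA_PVW` may differ.

```r
# Stand-in for a COMMA_PVW result; every value here is a placeholder
mock_results <- data.frame(
  Parameter = c("beta_0", "beta_x", "theta_0"),  # hypothetical labels
  Estimates = NA_real_,                          # not real estimates
  Convergence = c(TRUE, TRUE, TRUE),             # assumed to be logical
  Method = "PVW"
)

# Confirm the four documented columns are present
expected_cols <- c("Parameter", "Estimates", "Convergence", "Method")
stopifnot(all(expected_cols %in% names(mock_results)))

# Collect any rows reporting non-convergence before using the estimates
not_converged <- mock_results[!mock_results$Convergence, ]
nrow(not_converged)
```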

Examples

set.seed(20240709)
sample_size <- 2000

n_cat <- 2 # Number of categories in the binary mediator

# Data generation settings
x_mu <- 0
x_sigma <- 1
z_shape <- 1
c_shape <- 1

# True parameter values (gamma terms set the misclassification rate)
true_beta <- matrix(c(1, -2, .5), ncol = 1)
true_gamma <- matrix(c(1, 1, -.5, -1.5), nrow = 2, byrow = FALSE)
true_theta <- matrix(c(1, 1.5, -2, -.2), ncol = 1)

example_data <- COMMA_data(sample_size, x_mu, x_sigma, z_shape, c_shape,
                           interaction_indicator = FALSE,
                           outcome_distribution = "Bernoulli",
                           true_beta, true_gamma, true_theta)
                           
beta_start <- matrix(rep(1, 3), ncol = 1)
gamma_start <- matrix(rep(1, 4), nrow = 2, ncol = 2)
theta_start <- matrix(rep(1, 4), ncol = 1)

Mstar <- example_data[["obs_mediator"]]
outcome <- example_data[["outcome"]]
x_matrix <- example_data[["x"]]
z_matrix <- example_data[["z"]]
c_matrix <- example_data[["c"]]
                           
PVW_results <- COMMA_PVW(Mstar, outcome, outcome_distribution = "Bernoulli",
                         interaction_indicator = FALSE,
                         x_matrix, z_matrix, c_matrix,
                         beta_start, gamma_start, theta_start)

PVW_results

[Package COMMA version 1.0.0 Index]