cb.sims.sim_linear {causalBatch}    R Documentation

Linear Simulation

Description

Simulates outcomes for two groups/batches in which the outcome depends linearly on a single covariate.

Usage

cb.sims.sim_linear(
  n = 100,
  pi = 0.5,
  eff_sz = 1,
  alpha = 2,
  unbalancedness = 1,
  err = 1/2,
  null = FALSE,
  a = -2,
  b = -1,
  nbreaks = 200
)

Arguments

n

the number of samples. Defaults to 100.

pi

the balance between the classes, where samples will be from group 1 with probability pi, and group 2 with probability 1 - pi. Defaults to 0.5.

eff_sz

the treatment effect between the different groups. Defaults to 1.

alpha

the alpha parameter of the Beta distributions from which the covariates are sampled (see Details). Defaults to 2.

unbalancedness

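the level of dissimilarity between the covariate distributions of the two groups. It scales the beta coefficient used for covariate sampling, beta = alpha * unbalancedness (see Details). Defaults to 1.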

err

the level of noise for the simulation, given as the standard deviation of the error terms (see Details). Defaults to 1/2.

null

whether to generate a null simulation, i.e. one with no treatment effect. Defaults to FALSE. The same behavior can be achieved by setting eff_sz = 0.

a

the first parameter for the covariate/outcome relationship: the slope multiplying the shifted covariate (X_i + b) in Details. Defaults to -2.

b

the second parameter for the covariate/outcome relationship: the offset added to the covariate X_i before it is scaled by a (see Details). Defaults to -1.

nbreaks

the number of breakpoints for computing the expected outcome at a given covariate level for each batch. Defaults to 200.

Value

a list, containing the following:

Ys

an [n, 2] matrix, containing the outcomes for each sample. The first dimension contains the "treatment effect".

Ts

an [n, 1] matrix, containing the group/batch labels for each sample.

Xs

an [n, 1] matrix, containing the covariate values for each sample.

Eps

an [n, 1] matrix, containing the error for each sample.

x.bounds

the theoretical bounds for the covariate values.

Ytrue

an [nbreaks*2, 2] matrix, containing the expected outcomes at a covariate level indicated by Xtrue.

Ttrue

an [nbreaks*2, 1] matrix, indicating the group/batch the expected outcomes and covariate breakpoints correspond to.

Xtrue

an [nbreaks*2, 1] matrix, indicating the values of the covariate breakpoints for the theoretical expected outcome in Ytrue.

Overlap

the theoretical degree of overlap between the covariate distributions for each of the two groups/batches.
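
As a rough illustration of how the returned list might be inspected, the sketch below uses only the argument and element names documented on this page. It assumes the causalBatch package is installed and does nothing otherwise; the values n = 200 and nbreaks = 50 are arbitrary choices for the illustration.

```r
# Guarded so the snippet is a no-op when causalBatch is not installed.
if (requireNamespace("causalBatch", quietly = TRUE)) {
  sim <- causalBatch::cb.sims.sim_linear(n = 200, nbreaks = 50)

  # Sampled data: outcomes, group/batch labels, covariates, errors.
  str(sim$Ys)
  table(sim$Ts)
  range(sim$Xs)
  str(sim$Eps)

  # Theoretical quantities: covariate bounds, expected outcomes on the
  # breakpoint grid, and the overlap between the two covariate distributions.
  print(sim$x.bounds)
  str(sim$Ytrue)
  str(sim$Ttrue)
  str(sim$Xtrue)
  print(sim$Overlap)
}
```

No expected output is shown because the sampled values are random and depend on the installed package version.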

Details

Simulates a linear relationship between the covariate and the outcome. The first dimension of the outcome is:

Y_i = a\times (X_i + b) - \text{eff\_sz} \times T_i + \frac{1}{2} \epsilon_i

where the batch/group labels are:

T_i \overset{iid}{\sim} Bern(\pi)

The beta coefficient for the covariate sampling is:

\beta = \alpha \times \text{unbalancedness}

The covariate values for the first batch are:

X_i | T_i = 0 \overset{ind}{\sim} 2 Beta(\alpha, \beta) - 1

and the covariate values for the second batch are:

X_i | T_i = 1 \overset{ind}{\sim} 2 Beta(\beta, \alpha) - 1

Finally, the error terms are:

\epsilon_i \overset{iid}{\sim} Norm(0, \text{err}^2)

For more details see the help vignette: vignette("causal_simulations", package = "causalBatch")
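
The sampling steps above can be sketched in base R. This is a minimal illustration written directly from the displayed formulas, not the package's implementation: sketch_sim_linear is a hypothetical helper name, it generates only the first outcome dimension, and it returns plain vectors rather than the matrices described under Value.

```r
# Hypothetical helper (not part of causalBatch): draws from the model
# exactly as it is written in the Details section.
sketch_sim_linear <- function(n = 100, pi = 0.5, eff_sz = 1, alpha = 2,
                              unbalancedness = 1, err = 1/2, a = -2, b = -1) {
  # Group/batch labels: T_i ~ Bern(pi)
  Ts <- rbinom(n, size = 1, prob = pi)

  # Beta coefficient for the covariate sampling
  beta <- alpha * unbalancedness

  # Covariates on [-1, 1]:
  #   T_i = 0: 2 * Beta(alpha, beta) - 1
  #   T_i = 1: 2 * Beta(beta, alpha) - 1
  shape1 <- ifelse(Ts == 0, alpha, beta)
  shape2 <- ifelse(Ts == 0, beta, alpha)
  Xs <- 2 * rbeta(n, shape1 = shape1, shape2 = shape2) - 1

  # Error terms: eps_i ~ Norm(0, err^2)
  Eps <- rnorm(n, mean = 0, sd = err)

  # First outcome dimension, term by term as in the displayed equation
  Ys <- a * (Xs + b) - eff_sz * Ts + 0.5 * Eps

  list(Ys = Ys, Ts = Ts, Xs = Xs, Eps = Eps)
}

set.seed(1)
sk <- sketch_sim_linear(n = 500)
```

With unbalancedness = 1 both groups share the same Beta(alpha, alpha) covariate distribution; values other than 1 make the two Beta distributions mirror images of each other around 0, which is what separates the groups' covariates.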

Author(s)

Eric W. Bridgeford

References

Eric W. Bridgeford, et al. "A Causal Perspective for Batch Effects: When is no answer better than a wrong answer?" bioRxiv (2024).

Examples


library(causalBatch)
sim = cb.sims.sim_linear()


[Package causalBatch version 1.2.0 Index]