wqs_sim {wqspt}    R Documentation

WQS simulated dataset generator

Description

wqs_sim generates a simulated dataset of mixture components, covariates, and outcomes based on an initial set of specifications.

Usage

wqs_sim(
  nmix = 10,
  ncovrt = 10,
  nobs = 500,
  ntruewts = 10,
  ntruecovrt = 5,
  vcov = 0,
  eps = 1,
  truewqsbeta = NULL,
  truebeta0 = NULL,
  truewts = NULL,
  truegamma = NULL,
  rnd_wqsbeta_dir = "none",
  seed = 101,
  q = 10,
  family = "gaussian"
)

Arguments

nmix

Number of mixture components in simulated dataset.

ncovrt

Number of covariates in simulated dataset.

nobs

Number of observations in simulated dataset.

ntruewts

Number of mixture components that have a non-zero association with the outcome (i.e., are not noise).

ntruecovrt

Number of covariates that have a non-zero association with the outcome (i.e., are not noise).

vcov

Variance-covariance specification for the simulated independent variables (i.e., the m exposure mixture components and z covariates). This is either a variance-covariance matrix of dimensions (m + z) x (m + z) or a single value. If a single value is given, the variance-covariance matrix has ones on the diagonal and that value in every off-diagonal position. For example, if this input were 0.4 and there were two mixture components and no covariates, the variance-covariance matrix would be matrix(c(1, 0.4, 0.4, 1), nrow = 2, ncol = 2). The default value is 0, giving a variance-covariance matrix with variances of 1 and covariances of 0.
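
As a minimal sketch of the single-value case, the following base R code builds such a matrix by hand. The names m, z, rho, and vcov_mat are illustrative and not part of wqspt, and wqs_sim may construct the matrix differently internally.

```r
# Illustrative names (not part of wqspt): m mixture components, z covariates.
m <- 2
z <- 0
rho <- 0.4  # the single value that would be passed as vcov

p <- m + z
vcov_mat <- matrix(rho, nrow = p, ncol = p)  # fill every entry with rho
diag(vcov_mat) <- 1                          # set unit variances on the diagonal

vcov_mat
```

For these inputs, vcov_mat equals matrix(c(1, 0.4, 0.4, 1), nrow = 2, ncol = 2).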

eps

Dispersion parameter. If the family is "gaussian", this corresponds to the residual standard deviation. If the family is "binomial" or "poisson", this parameter is ignored. If the family is "negbin", this represents the "size" parameter of the negative binomial distribution (see the documentation for the rnbinom function for more details).

truewqsbeta

Simulated WQS beta_1 value. If NULL, then this value will be randomly sampled depending on the parameter rnd_wqsbeta_dir.

truebeta0

Simulated beta_0 value. If NULL, then this value will be randomly sampled from a standard normal distribution.

truewts

Simulated vector of mixture weights. If NULL, then this value will be randomly sampled from a Dirichlet distribution with a vector of alpha values all equal to 1 (see the documentation for the extraDistr::rdirichlet function for more details).
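
To illustrate what a Dirichlet draw with all alpha values equal to 1 looks like, the sketch below uses only base R: normalizing independent gamma draws yields a Dirichlet sample. This is an equivalent-in-distribution stand-in for extraDistr::rdirichlet, not the package's own code, and it does not show how weights are handled when ntruewts is smaller than nmix. The names alpha, g, and wts are illustrative.

```r
set.seed(101)

nmix <- 10
alpha <- rep(1, nmix)  # all alpha values equal to 1

# Normalized Gamma(alpha_i, 1) draws follow a Dirichlet(alpha) distribution.
g <- rgamma(nmix, shape = alpha, rate = 1)
wts <- g / sum(g)

wts       # non-negative weights
sum(wts)  # sums to 1 up to floating-point error
```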

truegamma

Simulated gamma vector. If NULL, then this value will be randomly sampled from a standard normal distribution.

rnd_wqsbeta_dir

Direction of randomly sampled truewqsbeta (used only if truewqsbeta = NULL). The options are "positive", "negative", or "none" (the default). If "positive" or "negative", truewqsbeta will be sampled from a standard half normal distribution in the respective direction. If "none", truewqsbeta will be sampled from a standard normal distribution.
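
The sampling rule can be sketched in base R by taking the absolute value of a standard normal draw, which follows a standard half normal distribution. The helper draw_wqsbeta is hypothetical and not part of wqspt; it treats any value other than "positive" or "negative" (such as the default "none" shown in Usage) as the unrestricted case.

```r
set.seed(101)

# Hypothetical helper (not part of wqspt) mimicking the documented behavior.
draw_wqsbeta <- function(direction = "none") {
  z <- rnorm(1)
  switch(direction,
         positive = abs(z),   # standard half normal, positive side
         negative = -abs(z),  # standard half normal, negative side
         z)                   # otherwise: unrestricted standard normal
}

draw_wqsbeta("positive")
draw_wqsbeta("negative")
draw_wqsbeta("none")
```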

seed

Random seed.

q

Number of quantiles.

family

Family for the generative model creating the outcome vector. For a continuous outcome, use "gaussian" or gaussian(link = "identity"). For a binary outcome, use "binomial" or binomial() with any accepted link function. For a count outcome, use "poisson" or poisson(link = "log") for Poisson distributed values, or "negbin" for negative binomial distributed values.
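
As a rough illustration of how family and eps could shape the outcome, the sketch below draws outcomes from a stand-in linear predictor using base R. It assumes the identity, logit, and log links; the log link for "negbin" is an assumption rather than something stated here. The names eta and y_* are illustrative, and this is not the package's internal code.

```r
set.seed(101)

n <- 500
eta <- rnorm(n, mean = -1, sd = 0.5)  # stand-in linear predictor (illustrative)
eps <- 1                              # dispersion parameter

y_gaussian <- rnorm(n, mean = eta, sd = eps)            # eps = residual SD
y_binomial <- rbinom(n, size = 1, prob = plogis(eta))   # eps ignored
y_poisson  <- rpois(n, lambda = exp(eta))               # eps ignored
y_negbin   <- rnbinom(n, size = eps, mu = exp(eta))     # eps = "size"

table(y_binomial)
summary(y_poisson)
```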

Value

wqs_sim returns a list of:

weights

Simulated weights.

coef

Simulated beta coefficients.

Data

Simulated dataset.

etahat

Linear predictor (eta) values predicted by the data generating model.

wqs

Weighted quantile sum vector (quantile-transformed mixture components multiplied by weights).

modmat

Model matrix.

Xq

Quantile-transformed mixture components.
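
The relationship between Xq and wqs can be sketched in base R as follows. Coding the quantiles as integers from 0 to q - 1 is an assumption made for this illustration, and the helper to_quantile and all variable names are hypothetical rather than taken from wqspt.

```r
set.seed(101)

nobs <- 200
nmix <- 3
q <- 4
X <- matrix(rnorm(nobs * nmix), nrow = nobs, ncol = nmix)
wts <- c(0.5, 0.3, 0.2)  # illustrative weights summing to 1

# Hypothetical helper: bin a vector into q quantile groups coded 0, ..., q - 1.
to_quantile <- function(x, q) {
  breaks <- quantile(x, probs = seq(0, 1, length.out = q + 1))
  as.integer(cut(x, breaks = breaks, include.lowest = TRUE)) - 1L
}

Xq <- apply(X, 2, to_quantile, q = q)  # quantile-transformed components
wqs <- as.vector(Xq %*% wts)           # weighted quantile sum per observation

head(Xq)
head(wqs)
```

Because the weights sum to 1, each wqs value lies between 0 and q - 1 under this coding.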

Examples


# For these examples, we fit a GLM to the simulated dataset, including the
# simulated WQS vector, only to show that the estimates approximately recover
# the user-specified coefficients beta1 and beta0. An example of running
# the full permutation test WQS regression on the simulated dataset
# (for which the WQS vector would be determined by the model)
# with the "gaussian" family is shown as well.

wqsform<-formula(paste0("y~wqs+",paste(paste0("C",1:10),collapse="+")))

testsim_gaussian<-
  wqs_sim(truewqsbeta=0.2,truebeta0=-2,
          truewts=c(rep(0.15,5),rep(0.05,5)),family="gaussian")
Dat<-testsim_gaussian$Data
Dat$wqs<-testsim_gaussian$wqs
summary(glm(wqsform,data=Dat,family="gaussian"))$coef[1:2,]

perm_test_res <- wqs_full_perm(formula = wqsform, data = testsim_gaussian$Data, 
                               mix_name = paste0("T",1:10), q = 10, b_main = 5, 
                               b_perm = 5, b1_pos = TRUE, b1_constr = FALSE, 
                               niter = 4, seed = 16, plan_strategy = "multicore", 
                               stop_if_nonsig = FALSE)

# Note: The default values of b_main = 1000, b_perm = 200, and niter = 200 
# are the recommended parameter values. This example has a lower b_main, 
# b_perm, and niter in order to serve as a shorter example run. 

 
testsim_logit<-
  wqs_sim(truewqsbeta=0.2,truebeta0=-2,
          truewts=c(rep(0.15,5),rep(0.05,5)),family="binomial")
Dat<-testsim_logit$Data
Dat$wqs<-testsim_logit$wqs
summary(glm(wqsform,data=Dat,family="binomial"))$coef[1:2,]



[Package wqspt version 1.0.1 Index]