wqs_sim {wqspt} | R Documentation |
WQS simulated dataset generator
Description
wqs_sim
generates a simulated dataset of mixture components, covariates,
and outcomes based on an initial set of specifications.
Usage
wqs_sim(
nmix = 10,
ncovrt = 10,
nobs = 500,
ntruewts = 10,
ntruecovrt = 5,
vcov = 0,
eps = 1,
truewqsbeta = NULL,
truebeta0 = NULL,
truewts = NULL,
truegamma = NULL,
rnd_wqsbeta_dir = "none",
seed = 101,
q = 10,
family = "gaussian"
)
Arguments
nmix |
Number of mixture components in simulated dataset. |
ncovrt |
Number of covariates in simulated dataset. |
nobs |
Number of observations in simulated dataset. |
ntruewts |
Number of mixture components that have a non-zero association with the outcome (i.e., are not noise). |
ntruecovrt |
Number of covariates that have a non-zero association with the outcome (i.e., are not noise). |
vcov |
Variance-covariance matrix of the simulated independent variables (i.e., the m exposure mixture components and z covariates). This is either a variance-covariance matrix of dimensions (m + z) x (m + z) or a single value. If it is a single value, the variance-covariance matrix will have ones on the diagonal and that single value in every off-diagonal position. For example, if this input were 0.4 and there were two mixture components and no covariates, the variance-covariance matrix would be matrix(c(1, 0.4, 0.4, 1), nrow = 2, ncol = 2). The default value is 0, giving a variance-covariance matrix with variances of 1 and covariances of 0. |
eps |
Dispersion parameter. If the family is "gaussian", this corresponds to the residual standard deviation. If the family is "binomial" or "poisson", this parameter is ignored. If the family is "negbin", this represents the "size" parameter of the negative binomial distribution (see the documentation for the rnbinom function for more details). |
truewqsbeta |
Simulated WQS beta_1 value. If NULL, then this value will be randomly sampled depending on the parameter rnd_wqsbeta_dir. |
truebeta0 |
Simulated beta_0 value. If NULL, then this value will be randomly sampled from a standard normal distribution. |
truewts |
Simulated vector of mixture weights. If NULL, then this value will be randomly sampled from a Dirichlet distribution with a vector of alpha values all equal to 1 (see the documentation for the extraDistr::rdirichlet function for more details). |
truegamma |
Simulated gamma vector. If NULL, then this value will be randomly sampled from a standard normal distribution. |
rnd_wqsbeta_dir |
Direction of randomly sampled truewqsbeta (if truewqsbeta = NULL). The options are "positive", "negative", or "none" (the default shown in Usage). If "positive" or "negative", truewqsbeta will be sampled from a standard half-normal distribution in the respective direction. If "none", truewqsbeta will be sampled from a standard normal distribution. |
seed |
Random seed. |
q |
Number of quantiles. |
family |
Family for the generative model creating the outcome vector. Options are "gaussian" or gaussian(link = "identity") for a continuous outcome; "binomial" or binomial() with any accepted link function for a binary outcome; and, for count outcomes, "poisson" or poisson(link = "log") for Poisson-distributed values or "negbin" for negative binomial-distributed values. |
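The vcov and rnd_wqsbeta_dir descriptions above can be illustrated with a minimal base R sketch. This is not the package's internal code: expand_vcov is a hypothetical helper, the Cholesky-based draw is one common way to simulate correlated normal variables, and abs(rnorm(1)) is used here as a stand-in for a standard half-normal draw.

```r
# Hypothetical helper (not part of wqspt): expand a single value into an
# (m + z) x (m + z) variance-covariance matrix with ones on the diagonal.
expand_vcov <- function(rho, m, z) {
  p <- m + z
  Sigma <- matrix(rho, nrow = p, ncol = p)
  diag(Sigma) <- 1
  Sigma
}

# Two mixture components, no covariates, off-diagonal value 0.4
Sigma <- expand_vcov(0.4, m = 2, z = 0)
print(Sigma)

# One way to draw correlated normal variables with covariance Sigma:
# multiply independent standard normals by the Cholesky factor.
set.seed(101)
n <- 5000
X <- matrix(rnorm(n * ncol(Sigma)), nrow = n) %*% chol(Sigma)
print(round(cor(X), 2))

# A standard half-normal draw is the absolute value of a standard normal
# draw; negating it gives the "negative" direction.
draw <- abs(rnorm(1))
beta_pos <- draw
beta_neg <- -draw
```

The empirical correlation printed for X should sit close to the requested 0.4, which is a quick way to sanity-check a user-supplied vcov before simulating.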
Value
wqs_sim
returns a list of:
weights |
Simulated weights. |
coef |
Simulated beta coefficients. |
Data |
Simulated dataset. |
etahat |
Predicted linear predictor (eta) values from the data generating model. |
wqs |
Weighted quantile sum vector (quantile-transformed mixture components multiplied by weights). |
modmat |
Model matrix. |
Xq |
Quantile-transformed mixture components. |
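The relationship between Xq, weights, and wqs described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: quantile_score is a hypothetical helper, and scoring quantiles from 0 to q - 1 is an assumed convention.

```r
set.seed(1)
q <- 4
n <- 200

# Three simulated mixture components (names T1..T3 chosen for illustration)
X <- matrix(rnorm(n * 3), ncol = 3,
            dimnames = list(NULL, paste0("T", 1:3)))

# Hypothetical helper: score each value by its quantile bin, 0 to q - 1
quantile_score <- function(x, q) {
  brks <- quantile(x, probs = seq(0, 1, length.out = q + 1))
  as.integer(cut(x, breaks = brks, include.lowest = TRUE)) - 1L
}

# Quantile-transformed mixture components
Xq <- apply(X, 2, quantile_score, q = q)

# Weights are non-negative and sum to 1
w <- c(0.5, 0.3, 0.2)

# Weighted quantile sum: one value per observation
wqs <- as.vector(Xq %*% w)
head(wqs)
```

Because the weights sum to 1, each wqs value stays within the range of the quantile scores.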
Examples
# These examples fit a GLM to the simulated dataset, including the simulated
# WQS vector, only to show that the user-specified values of beta1 and beta0
# are approximately recovered. An example of running the full permutation
# test WQS regression on the simulated dataset (for which the WQS vector
# would be determined by the model) with the "gaussian" family is shown
# as well.
wqsform <- formula(paste0("y ~ wqs + ", paste(paste0("C", 1:10), collapse = " + ")))
testsim_gaussian <-
  wqs_sim(truewqsbeta = 0.2, truebeta0 = -2,
          truewts = c(rep(0.15, 5), rep(0.05, 5)), family = "gaussian")
Dat <- testsim_gaussian$Data
Dat$wqs <- testsim_gaussian$wqs
summary(glm(wqsform, data = Dat, family = "gaussian"))$coef[1:2, ]
perm_test_res <- wqs_full_perm(formula = wqsform, data = testsim_gaussian$Data,
                               mix_name = paste0("T", 1:10), q = 10, b_main = 5,
                               b_perm = 5, b1_pos = TRUE, b1_constr = FALSE,
                               niter = 4, seed = 16, plan_strategy = "multicore",
                               stop_if_nonsig = FALSE)
# Note: The default values of b_main = 1000, b_perm = 200, and niter = 200
# are the recommended parameter values. This example has a lower b_main,
# b_perm, and niter in order to serve as a shorter example run.
testsim_logit <-
  wqs_sim(truewqsbeta = 0.2, truebeta0 = -2,
          truewts = c(rep(0.15, 5), rep(0.05, 5)), family = "binomial")
Dat <- testsim_logit$Data
Dat$wqs <- testsim_logit$wqs
summary(glm(wqsform, data = Dat, family = "binomial"))$coef[1:2, ]
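The examples above fit GLMs to outcomes simulated by wqs_sim. As a rough base R sketch of what the eps and family arguments describe, the code below generates outcomes from a linear predictor for each family. It is an assumption-laden illustration rather than the package's code: the wqs vector is a uniform stand-in, covariates are omitted, and the logit and log links (and the mu parameterization of rnbinom) are assumed.

```r
set.seed(101)
n <- 500

# Stand-in for a weighted quantile sum with q = 10 (values between 0 and 9)
wqs <- runif(n, min = 0, max = 9)

# User-specified coefficients, as in the examples above
beta0 <- -2
beta1 <- 0.2
eta <- beta0 + beta1 * wqs   # linear predictor (no covariates in this sketch)

eps <- 1

# "gaussian": eps is the residual standard deviation
y_gaussian <- rnorm(n, mean = eta, sd = eps)

# "binomial": eps is ignored (logit link assumed here)
y_binomial <- rbinom(n, size = 1, prob = plogis(eta))

# "poisson": eps is ignored (log link)
y_poisson <- rpois(n, lambda = exp(eta))

# "negbin": eps is the "size" parameter of rnbinom (log link assumed)
y_negbin <- rnbinom(n, size = eps, mu = exp(eta))

# Fitting a GLM to the Gaussian outcome approximately recovers beta0 and beta1
fit <- glm(y_gaussian ~ wqs, family = "gaussian")
print(coef(fit))
```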