generate_data {CIEE} | R Documentation |
Data generation function
Description
Function to generate data with n observations of a primary outcome Y, secondary outcome K, exposure X, and measured as well as unmeasured confounders L and U, where the primary outcome is a quantitative normally-distributed variable (setting = "GLM") or a censored time-to-event outcome under an accelerated failure time (AFT) model (setting = "AFT").
Under the AFT setting, the observed time-to-event variable T=exp(Y) as well as the censoring indicator C are also computed. X is generated as a genetic exposure variable in the form of a single nucleotide variant (SNV) in 0-1-2 additive coding with minor allele frequency maf. X can be generated independently of U (X_orth_U = TRUE) or dependent on U (X_orth_U = FALSE). For more details regarding the underlying model, see the vignette.
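The 0-1-2 additive coding and the relation T=exp(Y) can be illustrated with a minimal base R sketch. It does not reproduce the model used by generate_data (see the vignette for that); it assumes Hardy-Weinberg equilibrium, so that the minor-allele count is Binomial(2, maf), and the name T_obs is introduced here for illustration only.

```r
set.seed(1)
n   <- 1000   # sample size (same value as the generate_data() default)
maf <- 0.2    # minor allele frequency (same value as the default)

# Assumption: Hardy-Weinberg equilibrium, so the number of minor
# alleles per individual is Binomial(2, maf). This is an illustration,
# not the CIEE implementation.
X <- rbinom(n, size = 2, prob = maf)

table(X)      # counts of genotypes coded 0, 1, 2
mean(X) / 2   # empirical allele frequency, should be close to maf

# Under an AFT model the outcome Y is on the log-time scale, so the
# time-to-event variable is obtained by exponentiating it.
Y     <- rnorm(n, mean = 0, sd = 1)
T_obs <- exp(Y)   # hypothetical name; always positive
summary(T_obs)
```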
Usage
generate_data(setting = "GLM", n = 1000, maf = 0.2, cens = 0.3,
a = NULL, b = NULL, aXK = 0.2, aXY = 0.1, aXL = 0, aKY = 0.3,
aLK = 0, aLY = 0, aUY = 0, aUL = 0, mu_X = NULL, sd_X = NULL,
X_orth_U = TRUE, mu_U = 0, sd_U = 1, mu_K = 0, sd_K = 1, mu_L = 0,
sd_L = 1, mu_Y = 0, sd_Y = 1)
Arguments
setting |
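String with value "GLM" or "AFT", indicating whether the primary outcome is a quantitative normally-distributed variable ("GLM") or a censored time-to-event outcome under an AFT model ("AFT"). |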
n |
Numeric. Sample size. |
maf |
Numeric. Minor allele frequency of the genetic exposure variable. |
cens |
Numeric. Desired proportion of censored individuals; has to be
specified under the AFT setting. Note that the actual censoring
rate is generated through specification of the parameters a and b. |
a |
Numeric. Parameter for generating the desired censoring rate under the AFT setting. Has to be specified under the AFT setting. |
b |
Numeric. Parameter for generating the desired censoring rate under the AFT setting. Has to be specified under the AFT setting. |
aXK |
Numeric. Size of the effect of X on K. |
aXY |
Numeric. Size of the effect of X on Y. |
aXL |
Numeric. Size of the effect of X on L. |
aKY |
Numeric. Size of the effect of K on Y. |
aLK |
Numeric. Size of the effect of L on K. |
aLY |
Numeric. Size of the effect of L on Y. |
aUY |
Numeric. Size of the effect of U on Y. |
aUL |
Numeric. Size of the effect of U on L. |
mu_X |
Numeric. Expected value of X. |
sd_X |
Numeric. Standard deviation of X. |
X_orth_U |
Logical. Indicator whether X is generated independently of U (X_orth_U = TRUE) or dependent on U (X_orth_U = FALSE). |
mu_U |
Numeric. Expected value of U. |
sd_U |
Numeric. Standard deviation of U. |
mu_K |
Numeric. Expected value of K. |
sd_K |
Numeric. Standard deviation of K. |
mu_L |
Numeric. Expected value of L. |
sd_L |
Numeric. Standard deviation of L. |
mu_Y |
Numeric. Expected value of Y. |
sd_Y |
Numeric. Standard deviation of Y. |
Value
A data frame containing n observations of the variables Y, K, X, L, U. Under the AFT setting, T=exp(Y) and the censoring indicator C (0 = censored, 1 = uncensored) are also computed.
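Because C is coded 0 = censored and 1 = uncensored, the observed censoring rate is the share of zeros in C, not its mean. The sketch below shows this on a small hand-made data frame; toy is a hypothetical stand-in that mimics only the columns T and C, not actual output of generate_data.

```r
# Hypothetical stand-in for AFT output: only T and C are mimicked,
# and the values are made up for illustration.
toy <- data.frame(
  T = c(1.2, 0.4, 3.1, 2.2, 0.9),
  C = c(1, 0, 1, 1, 0)   # 0 = censored, 1 = uncensored
)

# Share of censored observations: count the zeros in C.
cens_rate <- mean(toy$C == 0)
cens_rate   # 2 of 5 observations are censored, i.e. 0.4

# The mean of C gives the complementary share of uncensored observations.
mean(toy$C)
```

The same expression applied to the C column of a data set generated under setting = "AFT" gives a rate that can be compared with the value passed as cens.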
Examples
# Generate data under the GLM setting with default values
dat_GLM <- generate_data()
head(dat_GLM)
# Generate data under the AFT setting; a and b must be supplied, all other arguments keep their defaults
dat_AFT <- generate_data(setting = "AFT", a = 0.2, b = 4.75)
head(dat_AFT)