generate_data {CIEE}  R Documentation 
Function to generate data with n observations of a primary outcome Y, secondary outcome K, exposure X, and measured as well as unmeasured confounders L and U, where the primary outcome is a quantitative normally-distributed variable (setting = "GLM") or censored time-to-event outcome under an accelerated failure time (AFT) model (setting = "AFT"). Under the AFT setting, the observed time-to-event variable T=exp(Y) as well as the censoring indicator C are also computed. X is generated as a genetic exposure variable in the form of a single nucleotide variant (SNV) in 0-1-2 additive coding with minor allele frequency maf. X can be generated independently of U (X_orth_U = TRUE) or dependent on U (X_orth_U = FALSE). For more details regarding the underlying model, see the vignette.
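
The 0-1-2 additive coding can be illustrated with a minimal base-R sketch. It assumes Hardy-Weinberg equilibrium, so the minor-allele count is drawn as Binomial(2, maf). This is an assumption made for illustration and not necessarily how generate_data draws X internally; the names n, maf and X simply mirror the arguments of the function.

```r
# Illustrative sketch only: draw an SNV in 0-1-2 additive coding,
# assuming Hardy-Weinberg equilibrium (not taken from the CIEE source).
set.seed(1)
n   <- 1000   # sample size
maf <- 0.2    # minor allele frequency

# Each individual carries two alleles; each is the minor allele with
# probability maf, so the minor-allele count is Binomial(2, maf).
X <- rbinom(n, size = 2, prob = maf)

table(X)      # genotype counts for 0, 1 and 2 minor alleles
mean(X) / 2   # empirical allele frequency, should be close to maf
```

Dividing the mean genotype by two recovers the allele frequency because every individual contributes two alleles.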
generate_data(setting = "GLM", n = 1000, maf = 0.2, cens = 0.3,
  a = NULL, b = NULL, aXK = 0.2, aXY = 0.1, aXL = 0, aKY = 0.3,
  aLK = 0, aLY = 0, aUY = 0, aUL = 0, mu_X = NULL, sd_X = NULL,
  X_orth_U = TRUE, mu_U = 0, sd_U = 1, mu_K = 0, sd_K = 1,
  mu_L = 0, sd_L = 1, mu_Y = 0, sd_Y = 1)
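setting 
String with value "GLM" or "AFT", selecting whether the primary outcome is generated as a quantitative normally-distributed variable or as a censored time-to-event outcome under an AFT model. 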
n 
Numeric. Sample size. 
maf 
Numeric. Minor allele frequency of the genetic exposure variable. 
cens 
Numeric. Desired fraction of censored individuals (the default 0.3 corresponds to 30%); has to be specified under the AFT setting. Note that the actual censoring rate is generated through specification of the parameters a and b. 
a 
Integer for generating the desired censoring rate under the AFT setting. Has to be specified under the AFT setting. 
b 
Integer for generating the desired censoring rate under the AFT setting. Has to be specified under the AFT setting. 
aXK 
Numeric. Size of the effect of X on K. 
aXY 
Numeric. Size of the effect of X on Y. 
aXL 
Numeric. Size of the effect of X on L. 
aKY 
Numeric. Size of the effect of K on Y. 
aLK 
Numeric. Size of the effect of L on K. 
aLY 
Numeric. Size of the effect of L on Y. 
aUY 
Numeric. Size of the effect of U on Y. 
aUL 
Numeric. Size of the effect of U on L. 
mu_X 
Numeric. Expected value of X. 
sd_X 
Numeric. Standard deviation of X. 
X_orth_U 
Logical. Indicator whether X is generated independently of U (X_orth_U = TRUE) or dependent on U (X_orth_U = FALSE). 
mu_U 
Numeric. Expected value of U. 
sd_U 
Numeric. Standard deviation of U. 
mu_K 
Numeric. Expected value of K. 
sd_K 
Numeric. Standard deviation of K. 
mu_L 
Numeric. Expected value of L. 
sd_L 
Numeric. Standard deviation of L. 
mu_Y 
Numeric. Expected value of Y. 
sd_Y 
Numeric. Standard deviation of Y. 
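
The effect-size arguments suggest a chain of linear models linking X, U, L, K and Y. The sketch below shows one plausible structure implied by the argument names, using the default effect sizes from the usage line. The exact equations are an assumption here (the vignette documents the model actually used), and the noise terms are simply standard normal, since this page does not state how the mu_* and sd_* arguments enter the model.

```r
# Illustrative sketch only: one plausible linear data-generating structure
# implied by the argument names (assumed, not taken from the CIEE source).
set.seed(1)
n <- 1000; maf <- 0.2
aXK <- 0.2; aXY <- 0.1; aXL <- 0; aKY <- 0.3
aLK <- 0;   aLY <- 0;   aUY <- 0; aUL <- 0

U <- rnorm(n, mean = 0, sd = 1)           # unmeasured confounder
X <- rbinom(n, size = 2, prob = maf)      # exposure, drawn independently of U
L <- aXL * X + aUL * U + rnorm(n, 0, 1)   # measured confounder
K <- aXK * X + aLK * L + rnorm(n, 0, 1)   # secondary outcome
Y <- aXY * X + aKY * K + aLY * L + aUY * U + rnorm(n, 0, 1)  # primary outcome

dat <- data.frame(Y = Y, K = K, X = X, L = L, U = U)
head(dat)
```

Under this assumed structure, X affects Y both directly (aXY) and indirectly through K (aXK times aKY).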
A data frame containing n observations of the variables Y, K, X, L, U. Under the AFT setting, T=exp(Y) and the censoring indicator C (0 = censored, 1 = uncensored) are also computed.
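
The relationship between Y, T and C under the AFT setting can be sketched in base R as follows. Drawing censoring times from a uniform distribution on [a, b] is an assumption made for this example; this page only states that a and b determine the censoring rate. The names T_true, cens_time and T_obs are hypothetical, and a = 0.2, b = 4.75 are the values used in the package's AFT example call.

```r
# Illustrative sketch only: turn a log-scale outcome Y into an observed
# time and a censoring indicator. Uniform censoring on [a, b] is an
# assumption made for this example, not taken from the CIEE source.
set.seed(1)
n <- 1000
a <- 0.2; b <- 4.75                       # values from the AFT example call
Y <- rnorm(n, mean = 0, sd = 1)           # stand-in for the generated outcome

T_true    <- exp(Y)                       # latent event time, T = exp(Y)
cens_time <- runif(n, min = a, max = b)   # hypothetical censoring times
C         <- as.integer(T_true <= cens_time)  # 1 = uncensored, 0 = censored
T_obs     <- pmin(T_true, cens_time)      # observed follow-up time

mean(C == 0)                              # realized censoring rate
```

Comparing the realized rate with the desired value of cens is a quick way to judge whether a chosen pair a, b is suitable.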
# Generate data under the GLM setting with default values
dat_GLM <- generate_data()
head(dat_GLM)

# Generate data under the AFT setting with default values
dat_AFT <- generate_data(setting = "AFT", a = 0.2, b = 4.75)
head(dat_AFT)