gendata_cmgfm {CMGFM} | R Documentation |
Generate simulated data
Description
Generate simulated data from a covariate-augmented generalized factor model.
Usage
gendata_cmgfm(
seed = 1,
n = 300,
pveclist = list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60)),
q = 6,
d = 3,
rho = rep(1, length(pveclist)),
rho_z = 1,
sigmavec = rep(0.5, length(pveclist)),
n_bin = 1,
sigma_eps = 1,
seed.para = 1
)
Arguments
seed |
a positive integer, the random seed for reproducibility of data generation process. |
n |
a positive integer, specify the sample size. |
pveclist |
a named list, specify the number of modalities for each variable type and dimension of variables in each modality. |
q |
a positive integer, specify the number of modality-shared factors. |
d |
a positive integer, specify the dimension of covariate matrix. |
rho |
a numeric vector with length length(pveclist), one entry per variable type as in the default rep(1, length(pveclist)). |
rho_z |
a positive real, specify the signal strength of covariates. |
sigmavec |
a positive vector with length length(pveclist), one entry per variable type as in the default rep(0.5, length(pveclist)). |
n_bin |
a positive integer, specify the number of trials in Binomial distribution. |
sigma_eps |
a positive real, the variance of overdispersion error. |
seed.para |
a positive integer, the random seed for reproducibility of data generation process by fixing the regression coefficient vector and loading matrices. |
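As a rough illustration of how pveclist is read, the base R sketch below takes the default value from Usage and derives the quantities the argument description mentions: the variable types (the list names), the number of modalities per type (the length of each vector), and the total number of variables per type (the sum of each vector). The helper names n_types, n_modalities, and p_per_type are made up for this sketch and are not part of the package.

```r
# Same structure as the default pveclist shown in Usage.
pveclist <- list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60))

# Hypothetical helper names, used only for illustration.
n_types <- length(pveclist)           # number of variable types
n_modalities <- lengths(pveclist)     # number of modalities within each type
p_per_type <- sapply(pveclist, sum)   # total number of variables within each type

# The defaults of rho and sigmavec in Usage have one entry per variable type.
rho <- rep(1, length(pveclist))
sigmavec <- rep(0.5, length(pveclist))

print(n_types)
print(n_modalities)
print(p_per_type)
```

With this list there are two Gaussian modalities, one Poisson modality, and two binomial modalities, so rho and sigmavec each have three entries.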
Details
None
Value
Returns a list with the following components:
- XList - a list of matrices, each holding values of a single type, i.e., continuous, count, or binomial/binary values;
- Z - a matrix, the fixed-dimensional covariate matrix with control variables;
- Alist - the offset vector for each modality;
- B0list - the true loading matrix for each modality;
- mu0 - the true intercept vector for each modality;
- U0 - the modality-specific factor vector;
- F0 - the modality-shared factor matrix;
- Uplist - the true intercept-loading matrix for each modality;
- beta - the true regression coefficient vector for each modality;
- sigma_eps - the standard deviation of error term;
- numvarmat - a length(types)-by-d matrix, the number of variables in modalities that belong to the same type.
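One way to check a generated object against the component list above is sketched here. It assumes the CMGFM package is installed and skips the call otherwise; the vector expected is a hypothetical name holding the component names copied from this section, and the small values of n and q are arbitrary choices for a quick run.

```r
# Component names as documented in the Value section (copied by hand).
expected <- c("XList", "Z", "Alist", "B0list", "mu0", "U0", "F0",
              "Uplist", "beta", "sigma_eps", "numvarmat")

# Assumption: CMGFM is installed. If it is not, nothing below is run.
if (requireNamespace("CMGFM", quietly = TRUE)) {
  datlist <- CMGFM::gendata_cmgfm(n = 100, q = 3, d = 3)

  # Documented components that are absent from the result, if any.
  print(setdiff(expected, names(datlist)))

  # Top-level structure of the data matrices and the covariate matrix.
  str(datlist$XList, max.level = 1)
  print(dim(datlist$Z))
}
```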
References
None
See Also
Examples
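library(CMGFM)
# Sample size, and modality dimensions grouped by variable type
n <- 300
pveclist <- list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60))
# d covariates and q modality-shared factors
d <- 20
q <- 6
datlist <- gendata_cmgfm(n = n, pveclist = pveclist, q = q, d = d)
str(datlist)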
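The two seed arguments can be explored with a sketch like the following. It assumes the CMGFM package is installed, and the helper same_component is a hypothetical name introduced here. If the seeds behave as described in Arguments, repeating a call with the same seed and seed.para would be expected to reproduce XList, while changing only seed would be expected to keep B0list and change XList; the code prints the comparisons rather than asserting them.

```r
# Hypothetical helper: TRUE when a named component is identical in two lists.
same_component <- function(x, y, name) identical(x[[name]], y[[name]])

# Assumption: CMGFM is installed. If it is not, nothing below is run.
if (requireNamespace("CMGFM", quietly = TRUE)) {
  a <- CMGFM::gendata_cmgfm(seed = 1, seed.para = 1, n = 100)
  b <- CMGFM::gendata_cmgfm(seed = 1, seed.para = 1, n = 100)
  c2 <- CMGFM::gendata_cmgfm(seed = 2, seed.para = 1, n = 100)

  print(same_component(a, b, "XList"))    # same seed and seed.para
  print(same_component(a, c2, "B0list"))  # same seed.para, different seed
  print(same_component(a, c2, "XList"))   # different seed
}
```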