estimate_lucid {LUCIDus} | R Documentation |
Fit LUCID models with one or multiple omics layers
Description
EM algorithm to estimate LUCID with one or multiple omics layers
Usage
estimate_lucid(
lucid_model = c("early", "parallel", "serial"),
G,
Z,
Y,
CoG = NULL,
CoY = NULL,
K,
init_omic.data.model = "EEV",
useY = TRUE,
tol = 0.001,
max_itr = 1000,
max_tot.itr = 10000,
Rho_G = 0,
Rho_Z_Mu = 0,
Rho_Z_Cov = 0,
family = c("normal", "binary"),
seed = 123,
init_impute = c("mix", "lod"),
init_par = c("mclust", "random"),
verbose = FALSE
)
Arguments
lucid_model |
Specifies the LUCID model: "early" for early integration, "parallel" for LUCID in parallel, "serial" for LUCID in serial |
G |
an N by P matrix representing exposures |
Z |
Omics data. If "early", an N by M matrix; if "parallel", a list in which each element i is a matrix with N rows and P_i features; if "serial", a list in which each element i is either a matrix with N rows and P_i features or a list of two or more matrices, each with N rows and its own number of features |
Y |
a length N vector representing the outcome |
CoG |
an N by V matrix representing covariates to be adjusted for G -> X |
CoY |
an N by K matrix representing covariates to be adjusted for X -> Y |
K |
Number of latent clusters. If "early", an integer greater than or equal to 2; if "parallel", an integer vector of the same length as Z, with each element an integer greater than or equal to 2; if "serial", a list of the same length as Z, in which each element is either an integer like that for "early" or a list of integers like that for "parallel" |
init_omic.data.model |
a vector of strings specifying the geometric model of the omics data. If NULL, See more in ?mclust::mclustModelNames
useY |
logical, if TRUE, EM algorithm fits a supervised LUCID; otherwise unsupervised LUCID. |
tol |
stopping criterion for the EM algorithm |
max_itr |
Maximum iterations of the EM algorithm. If the EM algorithm iterates more than max_itr without converging, the EM algorithm is forced to stop. |
max_tot.itr |
Max number of total iterations for |
Rho_G |
A scalar. The LASSO penalty used to regularize
exposures. To tune the penalty, use the wrapper
function |
Rho_Z_Mu |
A scalar. The LASSO penalty used to
regularize cluster-specific means for omics data (Z). To tune the
penalty, use the wrapper function |
Rho_Z_Cov |
A scalar. The graphical LASSO
penalty used to estimate sparse cluster-specific variance-covariance matrices for omics
data (Z). To tune the penalty, use the wrapper function |
family |
The distribution of the outcome |
seed |
Random seed to initialize the EM algorithm |
init_impute |
Method to initialize the imputation of missing values in
LUCID. |
init_par |
For "early", the method used to initialize the EM algorithm. If "mclust",
initialize the parameters using the |
verbose |
A flag indicating whether detailed information for each iteration of the EM algorithm is printed to the console. Default is FALSE. |
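The shapes expected for Z and K depend on lucid_model. The following is a minimal base R sketch of inputs that match the descriptions above. The layer sizes, the column-binding used to build the "early" matrix, and the helper all_have_n_rows are illustrative assumptions and are not part of LUCIDus.

```r
set.seed(1008)
N <- 100  # number of observations

# Exposures: an N by P matrix (P = 5 here)
G <- matrix(rnorm(N * 5), nrow = N)

# Three illustrative omics layers with 10, 8, and 6 features
Z1 <- matrix(rnorm(N * 10), nrow = N)
Z2 <- matrix(rnorm(N * 8), nrow = N)
Z3 <- matrix(rnorm(N * 6), nrow = N)

# "early": a single N by M matrix; K is one integer >= 2.
# Binding the layers column-wise is an assumption made for illustration.
Z_early <- cbind(Z1, Z2, Z3)
K_early <- 2

# "parallel": a list of matrices; K is an integer vector of the same length
Z_parallel <- list(Z1 = Z1, Z2 = Z2, Z3 = Z3)
K_parallel <- c(2, 2, 3)

# "serial": a list whose elements are matrices or lists of matrices;
# K mirrors that nesting
Z_serial <- list(Z1, list(Z2, Z3))
K_serial <- list(2, list(2, 3))

# Hypothetical helper: TRUE when every matrix in a (possibly nested)
# list has n rows
all_have_n_rows <- function(z, n) {
  if (is.matrix(z)) return(nrow(z) == n)
  all(vapply(z, all_have_n_rows, logical(1), n = n))
}

# Basic consistency checks before calling estimate_lucid
stopifnot(
  nrow(G) == N,
  all_have_n_rows(Z_early, N),
  all_have_n_rows(Z_parallel, N),
  all_have_n_rows(Z_serial, N),
  length(K_parallel) == length(Z_parallel),
  length(K_serial) == length(Z_serial)
)
```

Checking the row counts and the lengths of K against Z up front tends to give clearer errors than letting a mismatch surface inside the EM algorithm.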
Value
A list containing the following objects:
res_Beta: estimation for G->X associations
res_Mu: estimation for the mu of the X->Z associations
res_Sigma: estimation for the sigma of the X->Z associations
res_Gamma: estimation for X->Y associations
inclusion.p: inclusion probability of cluster assignment for each observation
K: number of latent clusters for "early"; a list of the numbers of latent clusters for "parallel" and "serial"
var.names: names for the G, Z, Y variables
init_omic.data.model: pre-specified geometric model of multi-omics data
likelihood: converged LUCID model log likelihood
family: the distribution of the outcome
select: for LUCID early integration only, indicators of whether each exposure and omics feature is selected
useY: whether this LUCID model is supervised
Z: multi-omics data
init_impute: pre-specified imputation method
init_par: pre-specified parameter initialization method
Rho: for LUCID early integration only, pre-specified regularization tuning parameter
N: number of observations
submodel: for LUCID in serial only, storing all the submodels
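The components listed above can be read from the returned list by name. The sketch below assumes the LUCIDus package is installed and fits a small early-integration model on simulated noise. The helper describe_fit is hypothetical (not part of LUCIDus), and the internal layout of each component is not assumed, only inspected with str().

```r
# Hypothetical helper (not part of LUCIDus): report which documented
# components are present in a fitted object and print its log-likelihood.
describe_fit <- function(fit) {
  fields <- c("res_Beta", "res_Mu", "res_Sigma", "res_Gamma",
              "inclusion.p", "K", "likelihood", "family", "useY", "N")
  present <- intersect(fields, names(fit))
  cat("Components found:", paste(present, collapse = ", "), "\n")
  if ("likelihood" %in% present) {
    cat("Log-likelihood:", format(fit$likelihood), "\n")
  }
  invisible(present)
}

# Only run the fit when LUCIDus is available
if (requireNamespace("LUCIDus", quietly = TRUE)) {
  set.seed(1008)
  G <- matrix(rnorm(500), nrow = 100)   # 100 x 5 exposures
  Z <- matrix(rnorm(1000), nrow = 100)  # 100 x 10 omics features
  Y <- rnorm(100)                       # continuous outcome
  fit <- LUCIDus::estimate_lucid(lucid_model = "early",
                                 G = G, Z = Z, Y = Y, K = 2,
                                 family = "normal", seed = 1008)
  describe_fit(fit)
  str(fit$res_Beta)     # G -> X association estimates
  str(fit$inclusion.p)  # inclusion probabilities per observation
}
```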
Examples
# Simulate 100 observations: 5 exposures and five omics layers of 10 features each
i <- 1008
set.seed(i)
G <- matrix(rnorm(500), nrow = 100)
Z1 <- matrix(rnorm(1000), nrow = 100)
Z2 <- matrix(rnorm(1000), nrow = 100)
Z3 <- matrix(rnorm(1000), nrow = 100)
Z4 <- matrix(rnorm(1000), nrow = 100)
Z5 <- matrix(rnorm(1000), nrow = 100)
Z <- list(Z1 = Z1, Z2 = Z2, Z3 = Z3, Z4 = Z4, Z5 = Z5)
# Continuous outcome, plus two covariates each for the G -> X and X -> Y paths
Y <- rnorm(100)
CoY <- matrix(rnorm(200), nrow = 100)
CoG <- matrix(rnorm(200), nrow = 100)
# Fit a supervised LUCID in serial with two latent clusters per layer
fit1 <- estimate_lucid(G = G, Z = Z, Y = Y, K = list(2, 2, 2, 2, 2),
                       lucid_model = "serial",
                       family = "normal",
                       seed = i,
                       CoG = CoG, CoY = CoY,
                       useY = TRUE)
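The tol and max_itr arguments control when the EM algorithm stops. This page does not state the exact convergence rule LUCID uses, so the sketch below is only a toy illustration of how a tolerance and an iteration cap usually interact. It runs EM for a two-component, one-dimensional Gaussian mixture in base R and stops when the change in log-likelihood falls below tol, which is an assumption. The function toy_em and its return fields are hypothetical.

```r
# Toy EM for a two-component 1-D Gaussian mixture. Not LUCID's code:
# the stopping rule (change in log-likelihood below tol) is an assumption.
toy_em <- function(x, tol = 0.001, max_itr = 1000) {
  n <- length(x)
  # Crude starting values: split the data at the median
  mu <- c(mean(x[x <= median(x)]), mean(x[x > median(x)]))
  sigma <- rep(sd(x), 2)
  pi_k <- c(0.5, 0.5)
  loglik_old <- -Inf
  converged <- FALSE
  for (itr in seq_len(max_itr)) {
    # E-step: weighted component densities and responsibilities
    dens <- cbind(pi_k[1] * dnorm(x, mu[1], sigma[1]),
                  pi_k[2] * dnorm(x, mu[2], sigma[2]))
    loglik <- sum(log(rowSums(dens)))
    resp <- dens / rowSums(dens)
    # Stop once the log-likelihood changes by less than tol
    if (abs(loglik - loglik_old) < tol) {
      converged <- TRUE
      break
    }
    loglik_old <- loglik
    # M-step: update mixing proportions, means, and standard deviations
    n_k <- colSums(resp)
    pi_k <- n_k / n
    mu <- colSums(resp * x) / n_k
    sigma <- sqrt(colSums(resp * (x - rep(mu, each = n))^2) / n_k)
  }
  list(mu = mu, sigma = sigma, pi = pi_k, loglik = loglik,
       iterations = itr, converged = converged)
}

set.seed(1008)
x <- c(rnorm(150, mean = -2), rnorm(150, mean = 2))

# Stops on the tolerance
fit <- toy_em(x, tol = 0.001, max_itr = 1000)

# Forced to stop by the iteration cap (a tolerance of 0 is never met)
capped <- toy_em(x, tol = 0, max_itr = 5)
```

In this sketch, fit$converged records whether the tolerance was met, while capped shows the forced stop that max_itr describes: the loop ends after max_itr iterations without convergence.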