missoNet {missoNet} | R Documentation |
Fit a series of missoNet models with user-supplied regularization parameters for the lasso penalties
Description
This function fits conditional graphical lasso models to datasets with missing response values. ‘missoNet’ computes the regularization path for the lasso penalties sequentially along the bivariate regularization parameter sequence \{(\lambda_B, \lambda_\Theta)\} provided by the user.
Usage
missoNet(
X,
Y,
lambda.Beta,
lambda.Theta,
rho = NULL,
Beta.maxit = 10000,
Beta.thr = 1e-08,
eta = 0.8,
Theta.maxit = 10000,
Theta.thr = 1e-08,
eps = 1e-08,
penalize.diagonal = TRUE,
diag.penalty.factor = NULL,
standardize = TRUE,
standardize.response = TRUE,
fit.relax = FALSE,
parallel = FALSE,
cl = NULL,
verbose = 1
)
Arguments
X |
Numeric predictor matrix ( |
Y |
Numeric response matrix ( |
lambda.Beta |
A scalar or a numeric vector: a user-supplied sequence of non-negative value(s) for { |
lambda.Theta |
A scalar or a numeric vector: a user-supplied sequence of non-negative value(s) for { |
rho |
(Optional) A scalar or a numeric vector of length |
Beta.maxit |
The maximum number of iterations of the fast iterative shrinkage-thresholding algorithm (FISTA) for updating |
Beta.thr |
The convergence threshold of the FISTA algorithm for updating |
eta |
The backtracking line search shrinkage factor; the default is |
Theta.maxit |
The maximum number of iterations of the ‘ |
Theta.thr |
The convergence threshold of the ‘ |
eps |
A numeric tolerance level for the L1 projection of the empirical covariance matrix; the default is |
penalize.diagonal |
Logical: should the diagonal elements of |
diag.penalty.factor |
Numeric: a separate penalty multiplication factor for the diagonal elements of |
standardize |
Logical: should the columns of |
standardize.response |
Logical: should the columns of |
fit.relax |
Logical: the default is |
parallel |
Logical: the default is |
cl |
A cluster object created by ‘ |
verbose |
Value of |
Details
‘missoNet’ is the main model-fitting function. It is designed to fit conditional graphical lasso models (penalized multi-task Gaussian regressions) to corrupted datasets whose response values are missing at random (MAR).
To facilitate interpretation of the model, temporarily assume that the data used to fit it contain no missing values. Suppose we have n observations of both a p-variate predictor X \in \mathcal{R}^p and a q-variate response Y \in \mathcal{R}^q. For the ith sample (i = 1,...,n), ‘missoNet’ assumes the model
Y_i = \mu + X_i\mathbf{B} + E_i,\ \ E_i \sim \mathcal{MVN}(0_q, (\mathbf{\Theta})^{-1}),
where Y_i \in \mathcal{R}^{1\times q} and X_i \in \mathcal{R}^{1\times p} are one realization of the q responses and the p predictors, respectively.
E_i \in \mathcal{R}^{1\times q} is an error vector drawn from a multivariate Gaussian distribution.
The regression coefficient matrix \mathbf{B} \in \mathcal{R}^{p\times q}, which maps predictors to responses, and the precision (inverse covariance) matrix \mathbf{\Theta} \in \mathcal{R}^{q\times q}, which reveals the responses' conditional dependencies, are the parameters to be estimated by solving the penalized MLE problem
(\hat{\mathbf{\Theta}},\hat{\mathbf{B}}) = {\mathrm{argmin}}_{\mathbf{\Theta} \succeq 0,\ \mathbf{B}}\
g(\mathbf{\Theta},\mathbf{B}) + \lambda_{\Theta}(\|\mathbf{\Theta}\|_{1,\mathrm{off}} + 1_{n\leq \mathrm{max}(p,q)} \|\mathbf{\Theta}\|_{1,\mathrm{diag}}) + \lambda_{B}\|\mathbf{B}\|_1,
where
g(\mathbf{\Theta},\mathbf{B}) = \mathrm{tr}\left[\frac{1}{n}(\mathbf{Y} - \mathbf{XB})^\top(\mathbf{Y} - \mathbf{XB}) \mathbf{\Theta}\right]
- \mathrm{log}|\mathbf{\Theta}|.
The response matrix \mathbf{Y} \in \mathcal{R}^{n\times q} has ith row (Y_i - \frac{1}{n}\sum_{j=1}^n Y_j), and the predictor matrix \mathbf{X} \in \mathcal{R}^{n\times p} has ith row (X_i - \frac{1}{n}\sum_{j=1}^n X_j).
The intercept \mu \in \mathcal{R}^{1\times q} cancels out because the data matrices \mathbf{Y} and \mathbf{X} are centered.
1_{n\leq \mathrm{max}(p,q)} is the indicator function that determines whether the diagonal elements of \mathbf{\Theta} are penalized.
When n\leq \mathrm{max}(p,q), a global minimizer of the objective function defined above does not exist without the diagonal penalization.
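The complete-data formulation above can be made concrete with a small base-R sketch. It simulates data from the assumed model, centers both matrices, and evaluates the penalized objective at two candidate parameter pairs. The helper `penalized.objective` and the parameter values `B.true`, `Theta.true` and `mu` are hypothetical names and numbers chosen for illustration. They are not part of the ‘missoNet’ package, and the sketch does not reproduce the package's fitting algorithm.

```r
set.seed(1)
n <- 200; p <- 5; q <- 3

# Hypothetical "true" parameters, chosen only for illustration.
B.true <- matrix(0, p, q)
B.true[1, 1] <- 1.5
B.true[2, 3] <- -2
Theta.true <- diag(q)
Theta.true[1, 2] <- Theta.true[2, 1] <- 0.4
mu <- c(1, -1, 0.5)

# Simulate Y_i = mu + X_i B + E_i with E_i ~ MVN(0, Theta^{-1}).
X <- matrix(rnorm(n * p), n, p)
Sigma <- solve(Theta.true)
E <- matrix(rnorm(n * q), n, q) %*% chol(Sigma)
Y <- matrix(mu, n, q, byrow = TRUE) + X %*% B.true + E

# Column-center both matrices so that the intercept drops out.
Xc <- scale(X, center = TRUE, scale = FALSE)
Yc <- scale(Y, center = TRUE, scale = FALSE)

# Penalized objective for complete, centered data (illustrative helper).
# By default the diagonal of Theta is penalized only when n <= max(p, q).
penalized.objective <- function(Theta, B, X, Y, lambda.Theta, lambda.Beta,
                                penalize.diagonal = nrow(X) <= max(ncol(X), ncol(Y))) {
  n <- nrow(X)
  R <- Y - X %*% B                      # residual matrix
  S <- crossprod(R) / n                 # (1/n) R'R
  log.det <- as.numeric(determinant(Theta, logarithm = TRUE)$modulus)
  g <- sum(diag(S %*% Theta)) - log.det
  off.pen <- sum(abs(Theta)) - sum(abs(diag(Theta)))
  diag.pen <- if (penalize.diagonal) sum(abs(diag(Theta))) else 0
  g + lambda.Theta * (off.pen + diag.pen) + lambda.Beta * sum(abs(B))
}

# Objective at the generating parameters and at a "null" model
# (B = 0, Theta = identity).
obj.true <- penalized.objective(Theta.true, B.true, Xc, Yc, 0.1, 0.1)
obj.null <- penalized.objective(diag(q), matrix(0, p, q), Xc, Yc, 0.1, 0.1)
```

Here `obj.true` should be smaller than `obj.null`, because the null model leaves the signal in the residuals and inflates the trace term.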
Missingness in real data is inevitable. In this situation, estimates based only on complete cases are likely to be biased, and the optimization problem is likely to no longer be biconvex. In addition, many algorithms cannot be applied directly because they require complete datasets as inputs. ‘missoNet’ handles the specific situation where the response matrix \mathbf{Y} contains values that are missing at random (MAR); please refer to the vignette or other resources for the differences between MAR, missing completely at random (MCAR) and missing not at random (MNAR). ‘missoNet’ is also applicable to datasets with MCAR response values or without any missing values.
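As a rough illustration of why discarding incomplete rows is unattractive, the following base-R sketch shows how few complete cases remain at a modest missing rate. It assumes independent entry-wise MCAR missingness at 10% across q = 20 responses; the sizes and variable names are illustrative only. It demonstrates the loss of data under listwise deletion, not the bias that can arise under MAR.

```r
set.seed(42)
n <- 1000; q <- 20; miss.rate <- 0.10

# Response matrix with entries deleted independently at random (MCAR).
Y <- matrix(rnorm(n * q), n, q)
mask <- matrix(runif(n * q) < miss.rate, n, q)
Y[mask] <- NA

# Probability that a row has no missing entry: (1 - miss.rate)^q.
expected.complete <- (1 - miss.rate)^q
# Fraction of fully observed rows versus fraction of observed entries.
observed.complete <- mean(complete.cases(Y))
observed.entries <- mean(!is.na(Y))
```

With these settings, only about (0.9)^20, roughly 12%, of the rows are expected to be fully observed, even though about 90% of the individual entries are available.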
The method provides a unified framework for automatically solving a convex modification of the multi-task learning problem defined above, using corrupted datasets. Moreover, ‘missoNet’ enjoys the theoretical and computational benefits of convexity and returns solutions that are close to the clean conditional graphical lasso estimates. Please refer to the original manuscript (coming soon) for more details of the method.
Value
This function returns a 'list' consisting of the following components:
est.list |
A named
|
rho |
A vector of length |
penalize.diagonal |
Logical: whether the diagonal elements of |
diag.penalty.factor |
The additional penalty multiplication factor for the diagonal elements of |
Author(s)
Yixiao Zeng yixiao.zeng@mail.mcgill.ca, Celia M.T. Greenwood and Archer Yi Yang.
Examples
## Simulate a dataset with response values missing completely at random (MCAR),
## the overall missing rate is around 10%.
set.seed(123) # reproducibility
sim.dat <- generateData(n = 300, p = 50, q = 20, rho = 0.1, missing.type = "MCAR")
tr <- 1:240 # training set indices
tst <- 241:300 # test set indices
X.tr <- sim.dat$X[tr, ] # predictor matrix
Y.tr <- sim.dat$Z[tr, ] # corrupted response matrix
## Fit one missoNet model with two scalars for 'lambda.Beta' and 'lambda.Theta'.
fit1 <- missoNet(X = X.tr, Y = Y.tr, lambda.Beta = 0.1, lambda.Theta = 0.2)
## Fit a series of missoNet models with the lambda pairs := (lambda.Beta, lambda.Theta)
## sequentially extracted from the 'lambda.Beta' and 'lambda.Theta' vectors.
## Note that the two vectors must have the same length.
lamB.vec <- 10^(seq(from = 0, to = -1, length.out = 5))
lamTht.vec <- rep(0.1, 5)
fit2 <- missoNet(X = X.tr, Y = Y.tr, lambda.Beta = lamB.vec, lambda.Theta = lamTht.vec)
## Parallelization on a cluster with two cores.
cl <- parallel::makeCluster(2)
fit2 <- missoNet(X = X.tr, Y = Y.tr, lambda.Beta = lamB.vec, lambda.Theta = lamTht.vec,
                 parallel = TRUE, cl = cl)
parallel::stopCluster(cl)
## Extract the estimates at ('lamB.vec[1]', 'lamTht.vec[1]').
## The estimates at the subsequent lambda pairs could be accessed in the same way.
Beta.hat <- fit2$est.list[[1]]$Beta
Theta.hat <- fit2$est.list[[1]]$Theta
lambda.Beta <- fit2$est.list[[1]]$lambda.Beta # equal to 'lamB.vec[1]'
lambda.Theta <- fit2$est.list[[1]]$lambda.Theta # equal to 'lamTht.vec[1]'
## Fit a series of missoNet models using PRE-STANDARDIZED training data
## if you wish to compare the results with other software.
X.tr.std <- scale(X.tr, center = TRUE, scale = TRUE)
Y.tr.std <- scale(Y.tr, center = TRUE, scale = TRUE)
fit3 <- missoNet(X = X.tr.std, Y = Y.tr.std, lambda.Beta = lamB.vec, lambda.Theta = lamTht.vec,
                 standardize = FALSE, standardize.response = FALSE)
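
Because the lambda pairs in the examples above are read element by element from two vectors of equal length, exploring every combination of two candidate sequences requires expanding them into a grid first. The sketch below does this with base R's `expand.grid`. The names `lamB.seq`, `lamTht.seq`, `lamB.grid` and `lamTht.grid` are illustrative, and the commented-out call assumes that ‘missoNet’ accepts the expanded vectors in the same way as the equal-length vectors used earlier.

```r
## Candidate values for each penalty (illustrative sequences).
lamB.seq <- 10^(seq(from = 0, to = -1, length.out = 5))
lamTht.seq <- 10^(seq(from = 0, to = -1, length.out = 3))

## Expand to all 5 x 3 combinations; each row is one (lambda.Beta, lambda.Theta) pair.
lam.grid <- expand.grid(lambda.Beta = lamB.seq, lambda.Theta = lamTht.seq)
lamB.grid <- lam.grid$lambda.Beta
lamTht.grid <- lam.grid$lambda.Theta

## The two vectors now have the same length, as required.
stopifnot(length(lamB.grid) == length(lamTht.grid))

## Assumed usage, with X.tr and Y.tr as defined in the examples above:
## fit.grid <- missoNet(X = X.tr, Y = Y.tr,
##                      lambda.Beta = lamB.grid, lambda.Theta = lamTht.grid)
```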