msda {TULIP} | R Documentation |
Fits a regularization path of Sparse Discriminant Analysis and predicts
Description
Fits a regularization path of Sparse Discriminant Analysis at a sequence of regularization parameters lambda, and performs prediction when testing data are provided. The msda function solves classification problems by fitting a sparse discriminant analysis model. When covariates are provided, the function first adjusts the training data for them. It provides three models: binary fits the DSDA model for binary classification problems, while multi.original and multi.modified fit the MSDA model for multi-class classification problems. multi.original runs faster in low dimensions, but the dimension it can handle is limited. multi.modified has no such limitation and works in ultra-high dimensions. Users can choose the method through the model argument or rely on the default settings.
Usage
msda(x, z=NULL, y, testx=NULL,testz=NULL, model = NULL, lambda = NULL,
standardize=FALSE, alpha=1, nlambda = 100,
lambda.factor = ifelse((nobs - nclass)<= nvars, 0.2, 1e-03), dfmax = nobs,
pmax = min(dfmax * 2 + 20, nvars), pf = rep(1, nvars), eps = 1e-04,
maxit = 1e+06, sml = 1e-06, verbose = FALSE, perturb = NULL)
Arguments
x |
Input matrix of predictors. |
z |
Input covariate matrix of dimension |
y |
Class label. This argument should be a factor for classification. For |
testx |
Input testing matrix. Each row is a test case. When |
testz |
Input testing covariate matrix. Can be omitted if covariates are absent. However, training covariates |
model |
Method type. The |
lambda |
A user supplied |
standardize |
A logical value indicating whether x should be standardized before performing DSDA. Default is FALSE. This argument is only valid for |
alpha |
The elasticnet mixing parameter, the same as in glmnet. Default is alpha=1 so that the lasso penalty is used in DSDA. This argument is only valid for |
nlambda |
The number of tuning values in sequence |
lambda.factor |
The factor for getting the minimal lambda in |
dfmax |
The maximum number of selected variables in the model. Default is the number of observations |
pmax |
The maximum number of variables that may be selected during iteration. In intermediate steps, the algorithm can select at most |
pf |
L1 penalty factor of length |
eps |
Convergence threshold for coordinate descent. Each inner coordinate descent loop continues until the relative change in any coefficient falls below this threshold. Default value is |
maxit |
Maximum number of outer-loop iterations allowed at a fixed lambda value. Default is 1e6. If models do not converge, consider increasing |
sml |
Threshold for the ratio of the change in the loss function after each iteration to the previous loss function value. Default is |
verbose |
Whether to print out computation progress. The default is |
perturb |
A scalar number. If it is specified, the number will be added to each diagonal element of the covariance matrix as perturbation. The default is |
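The data-dependent defaults in the Usage signature can be evaluated by hand. The sketch below assumes that nobs, nvars and nclass denote the number of training rows, the number of predictors and the number of classes. The Usage refers to these names but this page does not define them, and the problem size is made up for illustration.

```r
# Hypothetical problem size; these values are made up for illustration.
nobs   <- 100   # assumed: number of training observations, nrow(x)
nvars  <- 500   # assumed: number of predictors, ncol(x)
nclass <- 3     # assumed: number of classes in y

# Default expressions copied from the Usage section.
lambda.factor <- ifelse((nobs - nclass) <= nvars, 0.2, 1e-03)
dfmax <- nobs
pmax  <- min(dfmax * 2 + 20, nvars)
pf    <- rep(1, nvars)

print(c(lambda.factor = lambda.factor, dfmax = dfmax, pmax = pmax))
```

Under the same assumption, a problem with many more observations than predictors (for example nobs = 1000 and nvars = 50) makes the condition false, so lambda.factor defaults to 1e-03.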
Details
The msda function fits a linear discriminant analysis model for vector X as follows:
\mathbf{X}|Y=k\sim N(\boldsymbol{\mu}_k,\boldsymbol{\Sigma}).
The categorical response is predicted from the Bayes rule:
\widehat{Y}=\arg\max_{k=1,\cdots,K}{(\mathbf{X}-\frac{\boldsymbol{\mu}_k}{2})^T\boldsymbol{\beta}_k+\log\pi_k}.
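The rule above can be sketched in base R. This is an illustration only, not the package's prediction code: predict_bayes is a hypothetical helper, the layout of mu and beta as p x K matrices is an assumption, and the toy numbers set beta equal to mu purely for convenience.

```r
# Score each class with (x - mu_k / 2)^T beta_k + log(pi_k) and pick the largest.
predict_bayes <- function(newx, mu, beta, prior) {
  # newx: n x p matrix; mu, beta: p x K matrices (one column per class, assumed
  # layout); prior: length-K vector of class probabilities.
  K <- ncol(mu)
  scores <- sapply(seq_len(K), function(k) {
    centered <- sweep(newx, 2, mu[, k] / 2)      # x - mu_k / 2, row by row
    drop(centered %*% beta[, k]) + log(prior[k])
  })
  scores <- matrix(scores, nrow = nrow(newx))    # keep a matrix when n = 1
  max.col(scores, ties.method = "first")         # index of the best class
}

# Toy inputs (made-up numbers); beta is set equal to mu for illustration only.
mu    <- cbind(c(0, 0), c(2, 0), c(0, 2))
beta  <- mu
prior <- rep(1 / 3, 3)
newx  <- rbind(c(0.1, -0.2), c(1.9, 0.3), c(-0.1, 2.2))
pred  <- predict_bayes(newx, mu, beta, prior)
print(pred)
```

Each test row lies close to one of the three class means, so the three rows are assigned to classes 1, 2 and 3 respectively.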
The model argument specifies which method is used to estimate \boldsymbol{\beta}. Users can choose binary for binary problems, and multi.original or multi.modified for multi-class problems. With multi.original, the algorithm first computes and stores the entire covariance matrix \boldsymbol{\Sigma}. With multi.modified, it neither computes nor stores the entire covariance matrix: because the algorithm works element-wise, it computes each element of the covariance matrix only when needed. Therefore, multi.original is faster in low dimensions, but multi.modified can fit models in much higher dimensions.
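The storage trade-off between the two multi-class methods can be illustrated in base R. This is a sketch of the idea only, not the package's internal code: it assumes a pooled within-class covariance estimate with denominator n - K, and sigma_elem is a hypothetical helper.

```r
set.seed(1)
n <- 30; p <- 5
x <- matrix(rnorm(n * p), n, p)          # simulated predictors
y <- factor(rep(1:3, each = 10))         # three balanced classes
K <- nlevels(y)

# Center each predictor within its class.
xc <- x - apply(x, 2, function(col) ave(col, y))

# Strategy 1: compute and store the full p x p matrix up front.
sigma_full <- crossprod(xc) / (n - K)

# Strategy 2: store nothing and compute a single entry when it is requested.
sigma_elem <- function(i, j) sum(xc[, i] * xc[, j]) / (n - K)

print(all.equal(sigma_full[2, 4], sigma_elem(2, 4)))
```

The first strategy needs memory that grows with the square of the number of predictors, while the second recomputes an entry on each request at a cost proportional to the number of observations.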
For computational speed, if models are not converging or are running slowly, consider increasing eps and sml, decreasing nlambda, or increasing lambda.factor before increasing maxit. Users can also reduce dfmax to limit the maximum number of variables in the model.
The Arguments section lists all parameters for the three models, but not all of them are needed for any one method. See the description of each argument for details. The output of the DSDA model includes only beta and lambda.
Value
An object with S3 class dsda, msda.original, or msda.modified.
beta |
Output variable coefficients for each |
df |
The number of nonzero coefficients for each value of |
obj |
The fitted value of the objective function for each value of |
dim |
Dimension of each coefficient matrix. |
lambda |
The actual |
x |
The input matrix of predictors for training. |
y |
Class label in training data. |
npasses |
Total number of iterations (of the innermost loop) summed over all lambda values |
jerr |
Error flag, for warnings and errors, 0 if no error. |
sigma |
Estimated sigma matrix. This argument is only available in object |
delta |
Estimated delta matrix. delta[k] = mu[k]-mu[1]. |
mu |
Estimated mu vector. |
prior |
Prior probability that y belongs to class k, estimated by the proportion of training observations in class k. |
call |
The call that produced this object |
pred |
Predicted categorical response for each value in sequence |
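The descriptions of prior, mu and delta can be made concrete with a small base R sketch on simulated data. The matrix layouts chosen here (mu as p x K, delta with one row per non-reference class) are assumptions for illustration and may differ from the layout of the returned object.

```r
set.seed(2)
x <- matrix(rnorm(12 * 3), 12, 3)                         # simulated predictors
y <- factor(rep(c("a", "b", "c"), times = c(6, 3, 3)))    # unbalanced classes

# Prior: proportion of training observations in each class.
prior <- as.vector(table(y)) / length(y)

# Class means, one column per class (assumed layout: p x K).
mu <- sapply(levels(y), function(k) colMeans(x[y == k, , drop = FALSE]))

# delta[k] = mu[k] - mu[1] for k = 2, ..., K (assumed layout: one row per class).
delta <- t(mu[, -1, drop = FALSE] - mu[, 1])

print(prior)
print(dim(delta))
```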
Author(s)
Yuqing Pan, Qing Mai, Xin Zhang
References
Mai, Q., Zou, H. and Yuan, M. (2012), "A direct approach to sparse discriminant analysis in ultra-high dimensions." Biometrika, 99, 29-42.
Mai, Q., Yang, Y., and Zou, H. (2017), "Multiclass sparse discriminant analysis." Statistica Sinica, in press.
URL: https://github.com/emeryyi/msda
See Also
Examples
library(TULIP)
data(GDS1615)
x <- GDS1615$x
y <- GDS1615$y
# Fit the sparse discriminant analysis path with the default settings
obj <- msda(x = x, y = y)