multinom.spls {plsgenomics}

R Documentation

Classification procedure for multi-label response based on a multinomial model, solved by a combination of the multinomial Ridge Iteratively Reweighted Least Squares (multinom-RIRLS) algorithm and the Adaptive Sparse PLS (SPLS) regression

Description

The function multinom.spls performs compression and variable selection in the context of multi-label ('nclass' > 2) classification (with possible prediction), using the algorithm of Durif et al. (2018), which is based on Ridge IRLS and sparse PLS.

Usage

multinom.spls(
  Xtrain,
  Ytrain,
  lambda.ridge,
  lambda.l1,
  ncomp,
  Xtest = NULL,
  adapt = TRUE,
  maxIter = 100,
  svd.decompose = TRUE,
  center.X = TRUE,
  scale.X = FALSE,
  weighted.center = TRUE
)

Arguments

`Xtrain`	a (ntrain x p) data matrix of predictor values. `Xtrain` must be a matrix. Each row corresponds to an observation and each column to a predictor variable.
`Ytrain`	a vector of length ntrain containing the class labels. `Ytrain` must be a vector or a one-column matrix, and contains the response variable for each observation. `Ytrain` should take values in {0,...,nclass-1}, where nclass is the number of classes.
`lambda.ridge`	a positive real value. `lambda.ridge` is the Ridge regularization parameter for the RIRLS algorithm (see details).
`lambda.l1`	a positive real value, in [0,1]. `lambda.l1` is the sparse penalty parameter for the dimension reduction step by sparse PLS (see details).
`ncomp`	a positive integer. `ncomp` is the number of PLS components. If `ncomp=0`, then the Ridge regression is performed without any dimension reduction (no SPLS step).
`Xtest`	a (ntest x p) matrix containing the predictor values for the test data set. `Xtest` may also be a vector of length p (corresponding to only one test observation). Default value is NULL, meaning that no prediction is performed.
`adapt`	a boolean value, indicating whether the sparse PLS selection step should be adaptive or not (see details).
`maxIter`	a positive integer. `maxIter` is the maximal number of iterations in the Newton-Raphson parts in the RIRLS algorithm (see details).
`svd.decompose`	a boolean parameter. `svd.decompose` indicates whether or not the predictor matrix `Xtrain` should be decomposed by SVD (singular value decomposition) for the RIRLS step (see details).
`center.X`	a boolean value indicating whether the data matrices `Xtrain` and `Xtest` (if provided) should be centered or not.
`scale.X`	a boolean value indicating whether the data matrices `Xtrain` and `Xtest` (if provided) should be scaled or not in the SPLS step (`scale.X=TRUE` implies `center.X=TRUE`).
`weighted.center`	a boolean value indicating whether the centering should take into account the weighted l2 metric or not in the SPLS step.

Details

The columns of the data matrices Xtrain and Xtest need not be standardized beforehand, since standardization can be performed by the function multinom.spls as a preliminary step (see `center.X` and `scale.X`).

The procedure described in Durif et al. (2018) is used to compute latent sparse components that are used in a multinomial regression model. In addition, when a matrix Xtest is supplied, the procedure predicts the response associated with these new values of the predictors.
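
Since `Ytrain` is expected to take values in {0,...,nclass-1}, labels stored as character strings or factors have to be recoded before calling multinom.spls. The following is a minimal base-R sketch of one way to do this; the label values and the variable names `labels`, `label.factor` and `code.to.label` are hypothetical and not part of the package.

```r
# Hypothetical raw labels; any character or factor vector would do.
labels <- c("tumor", "normal", "metastasis", "normal", "tumor")

# Map each distinct label to an integer code in 0, ..., nclass - 1.
# factor() sorts the levels, so the codes follow alphabetical order here.
label.factor <- factor(labels)
Ytrain <- as.integer(label.factor) - 1L
nclass <- nlevels(label.factor)

# Keep the mapping so that integer codes can be translated back to labels.
code.to.label <- levels(label.factor)
recovered <- code.to.label[Ytrain + 1L]
```

The same lookup (`code.to.label[codes + 1L]`) can be applied to predicted codes, assuming they use the same 0-based coding as `Ytrain`.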

Value

An object of class multinom.spls with the following attributes:

`Coefficients`	a (p+1) x (nclass-1) matrix containing the linear coefficients associated with the predictors and intercept in the multinomial model explaining the response Y.
`hatY`	the (ntrain) vector containing the estimated response value on the training set `Xtrain`.
`hatYtest`	the (ntest) vector containing the predicted labels for the observations from `Xtest` (if provided).
`DeletedCol`	the vector containing the indexes of columns with null variance in `Xtrain` that were skipped in the procedure.
`A`	a list of size nclass-1 with the predictors selected by the procedure for each set of coefficients in the multinomial model (i.e. the indexes of the corresponding non-null entries in each column of `Coefficients`). Each element of `A` is a subset of 1:p.
`A.full`	union of elements in A, corresponding to predictors selected in the full model.
`Anames`	the vector of selected predictor names, i.e. the names of the columns from `Xtrain` that are in `A.full`.
`converged`	a {0,1} value indicating whether the RIRLS algorithm did converge in fewer than `maxIter` iterations or not.
`X.score`	a list of nclass-1 different (n x ncomp) matrices containing the observation coordinates (scores) in the new component basis produced for each class in the multinomial model by the SPLS step (sparse PLS), see Durif et al. (2018) for details.
`X.weight`	a list of nclass-1 different (p x ncomp) matrices containing the coefficients of the predictors in each component produced for each class in the multinomial model by the sparse PLS, see Durif et al. (2018) for details.
`X.score.full`	a ((n x (nclass-1)) x ncomp) matrix containing the observation coordinates (scores) in the new component basis produced by the SPLS step (sparse PLS) in the linearized multinomial model, see Durif et al. (2018). Each column t.k of `X.score.full` is a SPLS component.
`X.weight.full`	a (p x ncomp) matrix containing the coefficients of the predictors in each component produced by sparse PLS in the linearized multinomial model, see Durif et al. (2018). Each column w.k of `X.weight.full` verifies t.k = Xtrain x w.k (as a matrix product).
`lambda.ridge`	the Ridge hyper-parameter used to fit the model.
`lambda.l1`	the sparse hyper-parameter used to fit the model.
`ncomp`	the number of components used to fit the model.
`V`	the (ntrain x ntrain) matrix used to weight the metric in the sparse PLS step. `V` is the inverse of the covariance matrix of the pseudo-response produced by the RIRLS step.
`proba`	the (ntrain) vector of estimated probabilities for the observations in `Xtrain`, that are used to estimate the `hatY` labels.
`proba.test`	the (ntest) vector of predicted probabilities for the new observations in `Xtest`, that are used to predict the `hatYtest` labels.

Author(s)

Ghislain Durif (https://gdurif.perso.math.cnrs.fr/).

References

Durif, G., Modolo, L., Michaelsson, J., Mold, J.E., Lambert-Lacroix, S., Picard, F., 2018. High dimensional classification with combined adaptive sparse PLS and logistic regression. Bioinformatics 34, 485–493. doi:10.1093/bioinformatics/btx571. Available at http://arxiv.org/abs/1502.05933.

Examples

## Not run: 
### load plsgenomics library
library(plsgenomics)

### generating data
n <- 100
p <- 100
nclass <- 3
sample1 <- sample.multinom(n, p, nb.class=nclass, kstar=20, lstar=2, 
                           beta.min=0.25, beta.max=0.75, 
                           mean.H=0.2, sigma.H=10, sigma.F=5)
X <- sample1$X
Y <- sample1$Y

### splitting between learning and testing set
index.train <- sort(sample(1:n, size=round(0.7*n)))
index.test <- (1:n)[-index.train]

Xtrain <- X[index.train,]
Ytrain <- Y[index.train,]
Xtest <- X[index.test,]
Ytest <- Y[index.test,]

### fitting the model, and predicting new observations
model1 <- multinom.spls(Xtrain=Xtrain, Ytrain=Ytrain, lambda.ridge=2, 
                        lambda.l1=0.5, ncomp=2, Xtest=Xtest, adapt=TRUE, 
                        maxIter=100, svd.decompose=TRUE)
                     
str(model1)

### prediction error rate
sum(model1$hatYtest!=Ytest) / length(index.test)

## End(Not run)
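
The overall error rate computed at the end of the example can be complemented by a confusion table and per-class error rates. The following is a minimal base-R sketch; the vectors `Ytest.toy` and `hatYtest.toy` are hypothetical stand-ins for `Ytest` and `model1$hatYtest`, so that the code runs without fitting a model.

```r
# Hypothetical stand-ins for Ytest and model1$hatYtest (labels coded 0, 1, 2).
Ytest.toy    <- c(0, 1, 2, 2, 1, 0, 1, 2)
hatYtest.toy <- c(0, 1, 2, 1, 1, 0, 2, 2)

# Overall misclassification rate, equivalent to sum(pred != obs) / n.
error.rate <- mean(hatYtest.toy != Ytest.toy)

# Confusion table: rows are predicted labels, columns are observed labels.
# Fixing the factor levels keeps the table square even if a class is never predicted.
confusion <- table(predicted = factor(hatYtest.toy, levels = 0:2),
                   observed  = factor(Ytest.toy, levels = 0:2))

# Per-class error rate: share of each observed class that was mislabelled.
per.class.error <- 1 - diag(confusion) / colSums(confusion)
```

With real output, replacing the two toy vectors by `Ytest` and `model1$hatYtest` (and `0:2` by `0:(nclass-1)`) should give the corresponding table for the fitted model.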

[Package plsgenomics version 1.5-3 Index]