mbplsrda {rchemo}R Documentation

multi-block PLSDA models

Description

Multi-block discrimination (DA) based on PLS.

The training variable y (univariate class membership) is firstly transformed to a dummy table containing nclas columns, where nclas is the number of classes present in y. Each column is a dummy variable (0/1). Then, a PLS2 is implemented on the X-data and the dummy table, returning latent variables (LVs) that are used as dependent variables in a DA model.

- mbplsrda: Usual "PLSDA". A linear regression model predicts the Y-dummy table from the PLS2 LVs. This corresponds to the PLSR2 of the X-data and the Y-dummy table. For a given observation, the final prediction is the class corresponding to the dummy variable for which the prediction is the highest.

- mbplslda and mbplsqda: Probabilistic LDA and QDA are run over the PLS2 LVs, respectively.

Usage


mbplsrda(Xlist, y, blockscaling = TRUE, weights = NULL, nlv, 
Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1])

mbplslda(Xlist, y, blockscaling = TRUE, weights = NULL, nlv, prior = c("unif", "prop"),
Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1])

mbplsqda(Xlist, y, blockscaling = TRUE, weights = NULL, nlv, prior = c("unif", "prop"),
Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1])

## S3 method for class 'Mbplsrda'
predict(object, X, ..., nlv = NULL) 

## S3 method for class 'Mbplsprobda'
predict(object, X, ..., nlv = NULL) 

Arguments

Xlist

For the main functions: list of training X-data (nrows).

X

For the auxiliary functions: list of new X-data (n rows), with the same variables than the training X-data.

y

Training class membership (n). Note: If y is a factor, it is replaced by a character vector.

blockscaling

logical. If TRUE, the scaling factor (computed on the training) is the "norm" of the block, i.e. the square root of the sum of the variances of each column of the block.

weights

Weights (n) to apply to the training observations for the PLS2. Internally, weights are "normalized" to sum to 1. Default to NULL (weights are set to 1 / n).

nlv

The number(s) of LVs to calculate.

prior

The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in y).

Xscaling

vector (of length Xlist) of variable scaling for each datablock, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used.

Yscaling

character. variable scaling for the Y-block after binary transformation, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used.

object

For the auxiliary functions: A fitted model, output of a call to the main functions.

...

For the auxiliary functions: Optional arguments. Not used.

Value

For mbplsrda:

fm

list with the MB-PLS model: (T): X-scores matrix; (P): X-loading matrix;(R): The PLS projection matrix (p,nlv); (W): X-loading weights matrix ;(C): The Y-loading weights matrix; (TT): the X-score normalization factor; (xmeans): the centering vector of X (p,1); (ymeans): the centering vector of Y (q,1); (weights): vector of observation weights; (blockscaling): block scaling; (Xnorms): "norm" of each block; (U): intermediate output.

lev

classes

ni

number of observations in each class

For mbplslda, mbplsqda:

fm

list with [[1]] the MB-PLS model: (T): X-scores matrix; (P): X-loading matrix;(R): The PLS projection matrix (p,nlv); (W): X-loading weights matrix ;(C): The Y-loading weights matrix; (TT): the X-score normalization factor; (xmeans): the centering vectors of X; (ymeans): the centering vector of Y (q,1); (xscales): the scaling vector of X (p,1); (yscales): the scaling vector of Y (q,1); (weights): vector of observation weights; (blockscaling): block scaling; (Xnorms): "norm" of each block; (U): intermediate output. [[2]] lda or qda models.

lev

classes

ni

number of observations in each class

For predict.Mbplsrda, predict.Mbplsprobda:

pred

predicted class for each observation

posterior

calculated probability of belonging to a class for each observation

Note

The first example concerns MB-PLSDA, and the second one concerns MB-PLS LDA. fm are PLS1 models, and zfm are PLS2 models.

Examples


## EXAMPLE OF MB-PLSDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8])

ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)

Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]
Xtestlist <- list(Xtest[,1:3], Xtest[,4:8])

nlv <- 5
fm <- mbplsrda(Xtrainlist, ytrain, Xscaling = "sd", nlv = nlv)
names(fm)

predict(fm, Xtestlist)
predict(fm, Xtestlist, nlv = 0:2)$pred

pred <- predict(fm, Xtestlist)$pred
err(pred, ytest)

zfm <- fm$fm
transform(zfm, Xtestlist)
transform(zfm, Xtestlist, nlv = 1)
summary(zfm, Xtrainlist)
coef(zfm)
coef(zfm, nlv = 0)
coef(zfm, nlv = 2)

## EXAMPLE OF MB-PLS LDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8])

ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)

Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]
Xtestlist <- list(Xtest[,1:3], Xtest[,4:8])

nlv <- 5
fm <- mbplslda(Xtrainlist, ytrain, Xscaling = "none", nlv = nlv)
predict(fm, Xtestlist)
predict(fm, Xtestlist, nlv = 1:2)$pred

zfm <- fm[[1]][[1]]
class(zfm)
names(zfm)
summary(zfm, Xtrainlist)
transform(zfm, Xtestlist)
coef(zfm)

## EXAMPLE OF MB-PLS QDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8])

ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)

Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]
Xtestlist <- list(Xtest[,1:3], Xtest[,4:8])

nlv <- 5
fm <- mbplsqda(Xtrainlist, ytrain, Xscaling = "none", nlv = nlv)
predict(fm, Xtestlist)
predict(fm, Xtestlist, nlv = 1:2)$pred

zfm <- fm[[1]][[1]]
class(zfm)
names(zfm)
summary(zfm, Xtrainlist)
transform(zfm, Xtestlist)
coef(zfm)


[Package rchemo version 0.1-2 Index]