R: multi-block PLSDA models

mbplsrda {rchemo}

R Documentation

multi-block PLSDA models

Description

Multi-block discrimination (DA) based on PLS.

The training variable y (univariate class membership) is first transformed to a dummy table containing nclas columns, where nclas is the number of classes present in y. Each column is a dummy variable (0/1). Then, a PLS2 is fitted on the X-data and the dummy table, returning latent variables (LVs) that are used as predictor (independent) variables in a DA model.
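The dummy coding described above can be sketched in base R. This is only an illustration of the idea, not the package's internal code; the vector `y` and the names `lev` and `Ydummy` are made up for the example.

```r
# Hypothetical class labels, mirroring the values used in the Examples section
y <- c(1, 4, 10, 4, 1, 10, 10)

# Class levels present in y; nclas is the number of columns of the dummy table
lev <- sort(unique(y))
nclas <- length(lev)

# One 0/1 column per class: Ydummy[i, k] is 1 when y[i] equals lev[k]
Ydummy <- outer(y, lev, "==") * 1
colnames(Ydummy) <- lev

Ydummy
```

Each row of `Ydummy` contains exactly one 1, so the table carries the same information as `y`.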

- mbplsrda: Usual "PLSDA". A linear regression model predicts the Y-dummy table from the PLS2 LVs. This corresponds to the PLSR2 of the X-data and the Y-dummy table. For a given observation, the final prediction is the class corresponding to the dummy variable for which the prediction is the highest.

- mbplslda and mbplsqda: Probabilistic LDA and QDA are run over the PLS2 LVs, respectively.
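The decision rule stated for mbplsrda (pick the class whose dummy variable receives the highest prediction) can be illustrated as follows. The matrix `Ypred` holds invented values standing in for the regression output; it is not produced by the package here.

```r
lev <- c("1", "4", "10")

# Hypothetical predicted dummy values: 3 observations (rows), 3 classes (columns)
Ypred <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.3, 0.6,
                  0.2, 0.5, 0.3),
                nrow = 3, byrow = TRUE, dimnames = list(NULL, lev))

# Predicted class = label of the column with the largest prediction in each row
pred <- lev[max.col(Ypred, ties.method = "first")]
pred
```

Regression predictions are not constrained to lie in [0, 1]; only their ranking within a row matters for this rule.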

Usage


mbplsrda(Xlist, y, blockscaling = TRUE, weights = NULL, nlv, 
Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1])

mbplslda(Xlist, y, blockscaling = TRUE, weights = NULL, nlv, prior = c("unif", "prop"),
Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1])

mbplsqda(Xlist, y, blockscaling = TRUE, weights = NULL, nlv, prior = c("unif", "prop"),
Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1])

## S3 method for class 'Mbplsrda'
predict(object, X, ..., nlv = NULL) 

## S3 method for class 'Mbplsprobda'
predict(object, X, ..., nlv = NULL)

Arguments

`Xlist`	For the main functions: list of training X-data (`n` rows).
`X`	For the auxiliary functions: list of new X-data (`n` rows), with the same variables as the training X-data.
`y`	Training class membership (`n`). Note: If `y` is a factor, it is replaced by a character vector.
`blockscaling`	logical. If TRUE, the scaling factor (computed on the training) is the "norm" of the block, i.e. the square root of the sum of the variances of each column of the block.
`weights`	Weights (`n`) to apply to the training observations for the PLS2. Internally, weights are "normalized" to sum to 1. Defaults to `NULL` (weights are set to `1 / n`).
`nlv`	The number(s) of LVs to calculate.
`prior`	The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in `y`).
`Xscaling`	vector (of length equal to the number of blocks in `Xlist`) giving the variable scaling for each data block, among "none" (mean-centering only), "pareto" (mean-centering and Pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", the uncorrected standard deviation is used.
`Yscaling`	character. Variable scaling for the Y-block after binary transformation, among "none" (mean-centering only), "pareto" (mean-centering and Pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", the uncorrected standard deviation is used.
`object`	For the auxiliary functions: A fitted model, output of a call to the main functions.
`...`	For the auxiliary functions: Optional arguments. Not used.
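The quantities named in the arguments above (block "norm", normalized weights, "sd" and "pareto" scaling, and the two prior options) can be sketched in base R. This is a rough illustration under stated assumptions, not the package's implementation: the data are simulated, the helper `uvar` is hypothetical, and the block norm is computed here with uncorrected variances, which the text above does not specify.

```r
set.seed(1)
n <- 10
X1 <- matrix(rnorm(n * 3), ncol = 3)      # hypothetical block with 3 variables
y <- rep(c("a", "b"), times = c(7, 3))    # hypothetical class membership

# Hypothetical helper: uncorrected variance (divisor n) of a vector
uvar <- function(x) mean((x - mean(x))^2)

# blockscaling: "norm" = square root of the sum of the column variances
# (assumption: uncorrected variances)
xnorm <- sqrt(sum(apply(X1, 2, uvar)))
X1bs <- X1 / xnorm

# weights: NULL means equal weights; weights are normalized to sum to 1
weights <- NULL
if (is.null(weights)) weights <- rep(1, n)
weights <- weights / sum(weights)

# Xscaling: mean-centering, then division by the uncorrected sd ("sd")
# or by its square root ("pareto")
Xc <- scale(X1, center = TRUE, scale = FALSE)
s <- sqrt(apply(X1, 2, uvar))
Xsd <- sweep(Xc, 2, s, "/")
Xpareto <- sweep(Xc, 2, sqrt(s), "/")

# prior: "unif" gives equal probabilities, "prop" the observed class proportions
nclas <- length(unique(y))
prior_unif <- rep(1 / nclas, nclas)
prior_prop <- as.numeric(table(y)) / n
```

After block scaling, the column variances of `X1bs` sum to 1, so blocks with many or high-variance variables do not dominate the others.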

Value

For mbplsrda:

`fm`	list with the MB-PLS model: (`T`): X-scores matrix; (`P`): X-loading matrix; (`R`): the PLS projection matrix (p,nlv); (`W`): X-loading weights matrix; (`C`): the Y-loading weights matrix; (`TT`): the X-score normalization factor; (`xmeans`): the centering vector of X (p,1); (`ymeans`): the centering vector of Y (q,1); (`weights`): vector of observation weights; (`blockscaling`): block scaling; (`Xnorms`): "norm" of each block; (`U`): intermediate output.
`lev`	classes
`ni`	number of observations in each class

For mbplslda, mbplsqda:

`fm`	list with [[1]] the MB-PLS model: (`T`): X-scores matrix; (`P`): X-loading matrix; (`R`): the PLS projection matrix (p,nlv); (`W`): X-loading weights matrix; (`C`): the Y-loading weights matrix; (`TT`): the X-score normalization factor; (`xmeans`): the centering vectors of X; (`ymeans`): the centering vector of Y (q,1); (`xscales`): the scaling vector of X (p,1); (`yscales`): the scaling vector of Y (q,1); (`weights`): vector of observation weights; (`blockscaling`): block scaling; (`Xnorms`): "norm" of each block; (`U`): intermediate output. [[2]] lda or qda models.
`lev`	classes
`ni`	number of observations in each class

For predict.Mbplsrda, predict.Mbplsprobda:

`pred`	predicted class for each observation
`posterior`	calculated probability of belonging to a class for each observation
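For the probabilistic variants, a posterior of this kind is usually obtained from Bayes' rule: prior times class density, normalized over the classes. The sketch below illustrates that rule on a single LV score with Gaussian densities and a common standard deviation (LDA-like). All values are invented, and it is an assumption that the package follows this exact computation.

```r
prior <- c(a = 0.5, b = 0.5)    # "unif" prior for two hypothetical classes
mu <- c(a = -1, b = 1)          # hypothetical class means on one LV score
sigma <- 1                      # common sd (LDA-like); QDA would use one per class
t_new <- c(-1.2, 0.1, 2.0)      # hypothetical scores of three new observations

# Class-conditional densities: one column per class
dens <- sapply(names(mu), function(k) dnorm(t_new, mean = mu[[k]], sd = sigma))

# Bayes' rule: prior * density, normalized so that each row sums to 1
num <- sweep(dens, 2, prior, "*")
posterior <- num / rowSums(num)

# Predicted class = class with the highest posterior probability
pred <- colnames(posterior)[max.col(posterior, ties.method = "first")]
pred
```

Replacing the uniform prior with observed class proportions shifts the posterior toward the more frequent classes.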

Note

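The first example concerns MB-PLSDA, the second MB-PLS LDA, and the third MB-PLS QDA. In each example, fm is the fitted discrimination model and zfm is the underlying MB-PLS model (a PLS2 on the dummy table) extracted from it.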

Examples


## EXAMPLE OF MB-PLSDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8])

ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)

Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]
Xtestlist <- list(Xtest[,1:3], Xtest[,4:8])

nlv <- 5
fm <- mbplsrda(Xtrainlist, ytrain, Xscaling = "sd", nlv = nlv)
names(fm)

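## Predictions with all the LVs, then with 0, 1 and 2 LVs
predict(fm, Xtestlist)
predict(fm, Xtestlist, nlv = 0:2)$pred

## Compare the predicted classes with the known test classes
pred <- predict(fm, Xtestlist)$pred
err(pred, ytest)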

zfm <- fm$fm
transform(zfm, Xtestlist)
transform(zfm, Xtestlist, nlv = 1)
summary(zfm, Xtrainlist)
coef(zfm)
coef(zfm, nlv = 0)
coef(zfm, nlv = 2)

## EXAMPLE OF MB-PLS LDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8])

ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)

Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]
Xtestlist <- list(Xtest[,1:3], Xtest[,4:8])

nlv <- 5
fm <- mbplslda(Xtrainlist, ytrain, Xscaling = "none", nlv = nlv)
predict(fm, Xtestlist)
predict(fm, Xtestlist, nlv = 1:2)$pred

zfm <- fm[[1]][[1]]
class(zfm)
names(zfm)
summary(zfm, Xtrainlist)
transform(zfm, Xtestlist)
coef(zfm)

## EXAMPLE OF MB-PLS QDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8])

ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)

Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]
Xtestlist <- list(Xtest[,1:3], Xtest[,4:8])

nlv <- 5
fm <- mbplsqda(Xtrainlist, ytrain, Xscaling = "none", nlv = nlv)
predict(fm, Xtestlist)
predict(fm, Xtestlist, nlv = 1:2)$pred

zfm <- fm[[1]][[1]]
class(zfm)
names(zfm)
summary(zfm, Xtrainlist)
transform(zfm, Xtestlist)
coef(zfm)

[Package rchemo version 0.1-2 Index]