R: PLSDA with aggregation of latent variables

plsrda_agg {rchemo}

R Documentation

PLSDA with aggregation of latent variables

Description

Ensemble approach in which the final predictions are obtained by aggregating ("averaging") the predictions of PLSDA models built with different numbers of latent variables (LVs).

For instance, if argument nlv is set to nlv = "5:10", the prediction for a new observation is the most frequent level (majority vote) over the predictions returned by the models with 5 LVs, 6 LVs, ..., 10 LVs.

- plsrda_agg: uses plsrda.

- plslda_agg: uses plslda.

- plsqda_agg: uses plsqda.
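
The voting rule described above can be sketched in base R as follows. This is only an illustration: the matrix predlv and the helper vote are hypothetical, the class labels are made up, and the tie-breaking rule shown here is an assumption (the package may resolve ties differently).

```r
# Hypothetical per-model predictions for 3 new observations (rows),
# one column per number of LVs (5 to 10). Values are made up.
predlv <- cbind(
    lv5  = c("a", "b", "c"),
    lv6  = c("a", "b", "b"),
    lv7  = c("b", "b", "c"),
    lv8  = c("a", "c", "c"),
    lv9  = c("a", "b", "c"),
    lv10 = c("b", "b", "b")
)

# Majority vote: for each observation, keep the most frequent level.
# Assumption: ties go to the first level in sorted order.
vote <- function(x) {
    counts <- table(x)
    names(counts)[which.max(counts)]
}

pred <- apply(predlv, 1, vote)
pred
```

Each row of predlv plays the role of the intermediate predictions for one observation, and pred plays the role of the aggregated prediction.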

Usage


plsrda_agg(X, y, weights = NULL, nlv)

plslda_agg(X, y, weights = NULL, nlv, prior = c("unif", "prop"))

plsqda_agg(X, y, weights = NULL, nlv, prior = c("unif", "prop"))

## S3 method for class 'Plsda_agg'
predict(object, X, ...)

Arguments

`X`	For the main functions: Training X-data (`n, p`). — For the auxiliary function: New X-data (`m, p`) to consider.
`y`	Training class membership (`n`). Note: If `y` is a factor, it is replaced by a character vector.
`weights`	Weights (`n, 1`) to apply to the training observations. Internally, weights are "normalized" to sum to 1. Defaults to `NULL` (weights are set to `1 / n`).
`nlv`	A character string such as "5:20" defining the range of the numbers of LVs to consider (here, the models with 5, 6, ..., 20 LVs are aggregated). A single value such as "10" is also allowed (here, it corresponds to the single model with 10 LVs).
`prior`	The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in `y`).
`object`	For the auxiliary function: A fitted model, output of a call to the main functions.
`...`	For the auxiliary function: Optional arguments. Not used.
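
As a rough illustration of the conventions described for nlv, weights and prior, the following base R sketch shows one way these quantities could be derived. The helper parse_nlv and the toy vectors are hypothetical and are not part of rchemo; the package's internal code may differ.

```r
# Hypothetical helper: turn "5:20" or "10" into an integer sequence.
parse_nlv <- function(nlv) {
    bounds <- as.integer(strsplit(nlv, ":", fixed = TRUE)[[1]])
    seq(min(bounds), max(bounds))
}
parse_nlv("5:10")
parse_nlv("10")

# Toy class membership vector (made up for illustration)
y <- c("a", "a", "a", "b")
n <- length(y)

# weights = NULL: every training observation gets weight 1 / n
w_default <- rep(1 / n, n)

# User-supplied weights are rescaled so that they sum to 1
w <- c(2, 1, 1, 4)
w <- w / sum(w)

# prior = "unif": equal probability for every class
lev <- sort(unique(y))
prior_unif <- setNames(rep(1 / length(lev), length(lev)), lev)

# prior = "prop": observed class proportions in y
prior_prop <- prop.table(table(y))
```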

Value

For plsrda_agg, plslda_agg and plsqda_agg:

fm

List containing:

- fm: the model, with components T (X-scores matrix), P (X-loadings matrix), R (PLS projection matrix (p, nlv)), W (X-loading weights matrix), C (Y-loading weights matrix), TT (X-score normalization factor), xmeans (centering vector of X (p, 1)), ymeans (centering vector of Y (q, 1)), weights (vector of observation weights) and U (intermediate output).

- lev: the classes.

- ni: the number of observations in each class.

nlv

range of the numbers of LVs considered

For predict.Plsda_agg:

`pred`	Final predictions (after aggregation)
`predlv`	Intermediate predictions (one per number of LVs)

Note

The first example concerns PLSRDA-AGG, and the second one concerns PLSLDA-AGG.

Examples


## EXAMPLE OF PLSRDA-AGG

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10, 2), size = n, replace = TRUE)

m <- 5
Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m]

nlv <- "2:5"
fm <- plsrda_agg(Xtrain, ytrain, nlv = nlv)
names(fm)
res <- predict(fm, Xtest)
names(res)
res$pred
err(res$pred, ytest)
res$predlv

pars <- mpars(nlv = c("1:3", "2:5"))
pars

res <- gridscore(
    Xtrain, ytrain, Xtest, ytest, 
    score = err, 
    fun = plsrda_agg, 
    pars = pars)
res

segm <- segmkf(n = n, K = 3, nrep = 1)
res <- gridcv(
    Xtrain, ytrain, 
    segm, score = err, 
    fun = plsrda_agg, 
    pars = pars,
    verb = TRUE)
res

## EXAMPLE OF PLSLDA-AGG

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10, 2), size = n, replace = TRUE)
#ytrain <- sample(c("a", "10", "d"), size = n, replace = TRUE)
m <- 5
Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m]

nlv <- "2:5"
fm <- plslda_agg(Xtrain, ytrain, nlv = nlv, prior = "unif")
names(fm)
res <- predict(fm, Xtest)
names(res)
res$pred
err(res$pred, ytest)
res$predlv

pars <- mpars(nlv = c("1:3", "2:5"), prior = c("unif", "prop"))
pars
res <- gridscore(
    Xtrain, ytrain, Xtest, ytest, 
    score = err, 
    fun = plslda_agg, 
    pars = pars)
res

segm <- segmkf(n = n, K = 3, nrep = 1)
res <- gridcv(
    Xtrain, ytrain, 
    segm, score = err, 
    fun = plslda_agg, 
    pars = pars,
    verb = TRUE)
res

[Package rchemo version 0.1-2 Index]