lwplsrda_agg {rchemo} | R Documentation |
Aggregation of KNN-LWPLSDA models with different numbers of LVs
Description
Ensemble method in which the predictions are calculated by "averaging" the predictions of KNN-LWPLSDA models built with different numbers of latent variables (LVs).
For instance, if argument nlv is set to nlv = "5:10", the prediction for a new observation is the most frequent level (vote) over the predictions returned by the models with 5 LVs, 6 LVs, ..., 10 LVs, respectively.
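To illustrate how a string such as "5:10" maps to the set of models being aggregated, the following base R sketch expands it into the corresponding integer sequence. The helper name expand_nlv is hypothetical and is not part of rchemo; the package may parse the string differently internally.

```r
# Hypothetical helper (not an rchemo function): expand an nlv string
# such as "5:10" or "10" into the numbers of LVs it designates.
expand_nlv <- function(nlv) {
  bounds <- as.integer(strsplit(nlv, ":", fixed = TRUE)[[1]])
  seq(min(bounds), max(bounds))
}

lvs_range  <- expand_nlv("5:10")  # one model per value: 5, 6, ..., 10 LVs
lvs_single <- expand_nlv("10")    # a single model with 10 LVs
length(lvs_range)                 # number of models entering the vote
```

With a single value, the "aggregation" reduces to one model, so the vote is simply that model's prediction.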
- lwplsrda_agg: uses plsrda.
- lwplslda_agg: uses plslda.
- lwplsqda_agg: uses plsqda.
Usage
lwplsrda_agg(
X, y,
nlvdis, diss = c("eucl", "mahal"),
h, k,
nlv,
cri = 4,
verb = FALSE
)
lwplslda_agg(
X, y,
nlvdis, diss = c("eucl", "mahal"),
h, k,
nlv,
prior = c("unif", "prop"),
cri = 4,
verb = FALSE
)
lwplsqda_agg(
X, y,
nlvdis, diss = c("eucl", "mahal"),
h, k,
nlv,
prior = c("unif", "prop"),
cri = 4,
verb = FALSE
)
## S3 method for class 'Lwplsrda_agg'
predict(object, X, ...)
## S3 method for class 'Lwplsprobda_agg'
predict(object, X, ...)
Arguments
X |
For the main functions: Training X-data ( |
y |
Training class membership ( |
nlvdis |
The number of LVs to consider in the global PLS used for the dimension reduction before calculating the dissimilarities. If |
diss |
The type of dissimilarity used for defining the neighbors. Possible values are "eucl" (default; Euclidean distance), "mahal" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated by sqrt(.5 * (1 - rho)). |
h |
A scale scalar defining the shape of the weight function. Lower is |
k |
The number of nearest neighbors to select for each observation to predict. |
nlv |
A character string such as "5:20" defining the range of the numbers of LVs to consider (here: the models with 5, 6, ..., 20 LVs are averaged). A single value such as "10" is also allowed (here: corresponds to the single model with 10 LVs). |
prior |
For |
cri |
Argument |
verb |
Logical. If |
object |
For the auxiliary functions: A fitted model, output of a call to the main function. |
... |
For the auxiliary functions: Optional arguments. Not used. |
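The correlation dissimilarity given for argument diss, sqrt(.5 * (1 - rho)), can be illustrated with a short base R sketch. The helper corr_diss is hypothetical (not an rchemo function), and the use of the Pearson correlation for rho is an assumption, since the type of correlation is not stated here.

```r
# Hypothetical helper (not an rchemo function): correlation dissimilarity
# between two numeric vectors, following sqrt(.5 * (1 - rho)).
corr_diss <- function(x, y) {
  rho <- cor(x, y)            # Pearson correlation (an assumption)
  sqrt(0.5 * (1 - rho))
}

x <- c(1, 2, 3, 4, 5)
d_same <- corr_diss(x, 2 * x + 1)         # rho = 1: dissimilarity close to 0
d_opp  <- corr_diss(x, -x)                # rho = -1: dissimilarity close to 1
d_mid  <- corr_diss(x, c(2, 1, 4, 3, 6))  # intermediate case
```

The transformation maps rho in [-1, 1] to a dissimilarity in [0, 1], so perfectly correlated observations are the closest neighbors.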
Value
For lwplsrda_agg, lwplslda_agg and lwplsqda_agg: object of class lwplsrda_agg, lwplslda_agg or lwplsqda_agg.
For predict.Lwplsrda_agg and predict.Lwplsprobda_agg:
pred |
prediction calculated for each observation, which is the most frequent level (vote) over the predictions returned by the models with different numbers of LVs |
listnn |
list with the neighbors used for each observation to be predicted |
listd |
list with the distances to the neighbors used for each observation to be predicted |
listw |
list with the weights attributed to the neighbors used for each observation to be predicted |
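The vote behind pred can be sketched in base R as follows. This is only an illustration of majority voting over per-model predictions, not the rchemo implementation: the helper vote, the made-up predictions, and the tie-breaking rule (first level in sorted order) are all assumptions.

```r
# Hypothetical helper (not an rchemo function): most frequent label.
# Ties are broken by taking the first level in sorted order (an assumption).
vote <- function(labels) {
  tab <- table(labels)
  names(tab)[which.max(tab)]
}

# Made-up class predictions for 3 observations (rows), returned by
# three models with 5, 6 and 7 LVs (columns).
preds <- cbind(
  lv5 = c("a", "b", "c"),
  lv6 = c("a", "c", "c"),
  lv7 = c("b", "c", "c")
)

# Aggregated prediction: one vote per observation, over the models
agg <- apply(preds, 1, vote)
agg
```

Each row is reduced to its most frequent label, so an observation predicted "a", "a", "b" by the three models is assigned "a".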
Note
The first example concerns KNN-LWPLSRDA-AGG. The second example concerns KNN-LWPLSLDA-AGG.
Examples
## KNN-LWPLSRDA-AGG
n <- 40 ; p <- 7
X <- matrix(rnorm(n * p), ncol = p, byrow = TRUE)
y <- sample(c(1, 4, 10), size = n, replace = TRUE)
Xtrain <- X ; ytrain <- y
m <- 5
Xtest <- X[1:m, ] ; ytest <- y[1:m]
nlvdis <- 5 ; diss <- "mahal"
h <- 2 ; k <- 10
nlv <- "2:4"
fm <- lwplsrda_agg(
Xtrain, ytrain,
nlvdis = nlvdis, diss = diss,
h = h, k = k,
nlv = nlv)
res <- predict(fm, Xtest)
res$pred
res$listnn
## Grid of parameter combinations, scored on the test set
nlvdis <- 5 ; diss <- "mahal"
h <- c(2, Inf)
k <- c(10, 15)
nlv <- c("1:3", "2:4")
pars <- mpars(nlvdis = nlvdis, diss = diss,
h = h, k = k, nlv = nlv)
pars
res <- gridscore(
Xtrain, ytrain, Xtest, ytest,
score = err,
fun = lwplsrda_agg,
pars = pars)
res
## Same grid, scored by 3-fold cross-validation
segm <- segmkf(n = n, K = 3, nrep = 1)
res <- gridcv(
Xtrain, ytrain,
segm, score = err,
fun = lwplsrda_agg,
pars = pars,
verb = TRUE)
names(res)
res$val
## KNN-LWPLSLDA-AGG
n <- 40 ; p <- 7
X <- matrix(rnorm(n * p), ncol = p, byrow = TRUE)
y <- sample(c(1, 4, 10), size = n, replace = TRUE)
Xtrain <- X ; ytrain <- y
m <- 5
Xtest <- X[1:m, ] ; ytest <- y[1:m]
nlvdis <- 5 ; diss <- "mahal"
h <- 2 ; k <- 10
nlv <- "2:4"
fm <- lwplslda_agg(
Xtrain, ytrain,
nlvdis = nlvdis, diss = diss,
h = h, k = k,
nlv = nlv, prior = "prop")
res <- predict(fm, Xtest)
res$pred
res$listnn
## Grid of parameter combinations (including prior), scored on the test set
nlvdis <- 5 ; diss <- "mahal"
h <- c(2, Inf)
k <- c(10, 15)
nlv <- c("1:3", "2:4")
pars <- mpars(nlvdis = nlvdis, diss = diss,
h = h, k = k, nlv = nlv,
prior = c("unif", "prop"))
pars
res <- gridscore(
Xtrain, ytrain, Xtest, ytest,
score = err,
fun = lwplslda_agg,
pars = pars)
res
## Same grid, scored by 3-fold cross-validation
segm <- segmkf(n = n, K = 3, nrep = 1)
res <- gridcv(
Xtrain, ytrain,
segm, score = err,
fun = lwplslda_agg,
pars = pars,
verb = TRUE)
names(res)
res$val