PLS {sharp} | R Documentation |
Partial Least Squares 'a la carte'
Description
Runs a Partial Least Squares (PLS) model in regression mode using the algorithm
implemented in pls. This function allows for the
construction of components based on different sets of predictor and/or
outcome variables. This function does not use stability.
Usage
PLS(
xdata,
ydata,
selectedX = NULL,
selectedY = NULL,
family = "gaussian",
ncomp = NULL,
scale = TRUE
)
Arguments
xdata |
matrix of predictors with observations as rows and variables as columns. |
ydata |
optional vector or matrix of outcome(s). If |
selectedX |
binary matrix of size |
selectedY |
binary matrix of size |
family |
type of PLS model. Only |
ncomp |
number of components. |
scale |
logical indicating if the data should be scaled (i.e. transformed so that all variables have a standard deviation of one). |
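To make the meaning of the scale argument concrete, the base R sketch below rescales a toy matrix so that every column has a standard deviation of one. The matrix x_toy is hypothetical, and using scale() here is an assumption for illustration only; it is not taken from the PLS source code.

```r
# Hypothetical toy predictors with very different spreads (not from the package)
set.seed(1)
x_toy <- cbind(
  rnorm(50, sd = 1),
  rnorm(50, sd = 10),
  rnorm(50, sd = 100)
)

# Centre each column and divide it by its standard deviation
x_scaled <- scale(x_toy, center = TRUE, scale = TRUE)

# Column standard deviations before and after scaling
sd_before <- apply(x_toy, 2, sd)
sd_after <- apply(x_scaled, 2, sd)
print(round(sd_before, 2))
print(round(sd_after, 2))
```

Without scaling, variables measured on larger scales would tend to dominate the components, which is why scaling is commonly applied before PLS.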
Details
All matrices are defined as in (Wold et al. 2001). The weight matrix Wmat
is the equivalent of loadings$X in pls. The loadings matrix Pmat is the
equivalent of mat.c in pls. The score matrices Tmat and Qmat are the
equivalent of variates$X and variates$Y in pls.
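The roles of the weights, scores and loadings can be sketched in base R for a single component and a single outcome, following the definitions in Wold et al. (2001). This is a minimal illustration on simulated data, not the code run by PLS; the object names (w, t_scores, p_load, c_weight) are hypothetical, and the normalisation conventions may differ from those of the returned matrices.

```r
# Simulated, centred and scaled data (hypothetical, for illustration only)
set.seed(1)
n <- 100
X <- scale(matrix(rnorm(n * 5), ncol = 5))
y <- scale(X[, 1] - X[, 2] + rnorm(n))

# X-weights: unit-norm direction maximising the covariance between X %*% w and y
w <- crossprod(X, y)
w <- w / sqrt(sum(w^2))

# X-scores: projection of the observations onto the weight vector
t_scores <- X %*% w

# X-loadings: regression of each predictor on the scores
p_load <- crossprod(X, t_scores) / sum(t_scores^2)

# Y-weight: regression of the outcome on the scores
c_weight <- crossprod(y, t_scores) / sum(t_scores^2)

# Deflated predictors, orthogonal to the scores by construction
X_deflated <- X - t_scores %*% t(p_load)
```

A second component would be obtained by repeating the same steps on X_deflated.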
Value
A list with:
Wmat |
matrix of X-weights. |
Wstar |
matrix of transformed X-weights. |
Pmat |
matrix of X-loadings. |
Cmat |
matrix of Y-weights. |
Tmat |
matrix of X-scores. |
Umat |
matrix of Y-scores. |
Qmat |
matrix needed for predictions. |
Rmat |
matrix needed for predictions. |
meansX |
vector used for centering of predictors, needed for predictions. |
sigmaX |
vector used for scaling of predictors, needed for predictions. |
meansY |
vector used for centering of outcomes, needed for predictions. |
sigmaY |
vector used for scaling of outcomes, needed for predictions. |
methods |
a list with |
params |
a list with
|
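The sketch below illustrates, in base R, how transformed X-weights relate to the X-weights and X-loadings, and why the centring and scaling vectors are needed for predictions. It assumes the classical single-outcome NIPALS algorithm with deflation, for which W* = W (P'W)^{-1} and T = X W*. All object names are hypothetical, and this is not presented as the computation performed inside PLS.

```r
set.seed(1)
n <- 100
p <- 6
ncomp <- 3
x_raw <- matrix(rnorm(n * p, mean = 2), ncol = p)
y_raw <- x_raw[, 1] - 2 * x_raw[, 2] + rnorm(n)

# Centring and scaling parameters, stored for later predictions
means_x <- colMeans(x_raw)
sigma_x <- apply(x_raw, 2, sd)
mean_y <- mean(y_raw)
sigma_y <- sd(y_raw)
X <- scale(x_raw, center = means_x, scale = sigma_x)
y <- (y_raw - mean_y) / sigma_y

# NIPALS for a single outcome, with deflation of the predictors
W <- P <- matrix(0, nrow = p, ncol = ncomp)
Tm <- matrix(0, nrow = n, ncol = ncomp)
cvec <- numeric(ncomp)
E <- X
for (k in seq_len(ncomp)) {
  w <- crossprod(E, y)
  w <- w / sqrt(sum(w^2))             # X-weights
  tk <- E %*% w                       # X-scores
  pk <- crossprod(E, tk) / sum(tk^2)  # X-loadings
  cvec[k] <- sum(y * tk) / sum(tk^2)  # Y-weight
  E <- E - tk %*% t(pk)               # deflation
  W[, k] <- w
  P[, k] <- pk
  Tm[, k] <- tk
}

# Transformed weights map the undeflated predictors directly to the scores
Wstar <- W %*% solve(crossprod(P, W))
max_diff <- max(abs(X %*% Wstar - Tm))

# Predictions for new observations reuse the stored means and standard deviations
beta <- Wstar %*% cvec
x_new <- matrix(rnorm(5 * p, mean = 2), ncol = p)
x_new_scaled <- scale(x_new, center = means_x, scale = sigma_x)
y_hat <- as.numeric(x_new_scaled %*% beta) * sigma_y + mean_y
```

New observations have to be transformed with the training means and standard deviations, not their own, so that they are expressed on the scale on which the model was fitted.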
References
Wold S, Sjöström M, Eriksson L (2001). “PLS-regression: a basic tool of chemometrics.” Chemometrics and Intelligent Laboratory Systems, 58(2), 109-130. ISSN 0169-7439, doi:10.1016/S0169-7439(01)00155-1, PLS Methods.
See Also
VariableSelection, BiSelection
Examples
if (requireNamespace("mixOmics", quietly = TRUE)) {
oldpar <- par(no.readonly = TRUE)
# Data simulation
set.seed(1)
simul <- SimulateRegression(n = 200, pk = 15, q = 3, family = "gaussian")
x <- simul$xdata
y <- simul$ydata
# PLS
mypls <- PLS(xdata = x, ydata = y, ncomp = 3)
if (requireNamespace("sgPLS", quietly = TRUE)) {
# Sparse PLS to identify relevant variables
stab <- BiSelection(
xdata = x, ydata = y,
family = "gaussian", ncomp = 3,
LambdaX = seq_len(ncol(x) - 1),
LambdaY = seq_len(ncol(y) - 1),
implementation = SparsePLS,
n_cat = 2
)
plot(stab)
# Refitting of PLS model
mypls <- PLS(
xdata = x, ydata = y,
selectedX = stab$selectedX,
selectedY = stab$selectedY
)
# Nonzero entries in weights are the same as in selectedX
par(mfrow = c(2, 2))
Heatmap(stab$selectedX,
legend = FALSE
)
title("Selected in X")
Heatmap(ifelse(mypls$Wmat != 0, yes = 1, no = 0),
legend = FALSE
)
title("Nonzero entries in Wmat")
Heatmap(stab$selectedY,
legend = FALSE
)
title("Selected in Y")
Heatmap(ifelse(mypls$Cmat != 0, yes = 1, no = 0),
legend = FALSE
)
title("Nonzero entries in Cmat")
}
# Multilevel PLS
# Generating repeated-measures design (50 groups of 4 observations)
z <- rep(seq_len(50), each = 4)
# Extracting the within-variability
x_within <- mixOmics::withinVariation(X = x, design = cbind(z))
# Running PLS on within-variability
mypls <- PLS(xdata = x_within, ydata = y, ncomp = 3)
par(oldpar)
}
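For readers without mixOmics installed, the idea behind the multilevel step can be sketched in base R as within-group centring, where each observation is expressed as a deviation from the mean of its group. This assumes that, for a single-factor design, the within-variability corresponds to this deviation; the exact output of mixOmics::withinVariation is not verified here, and the objects below are hypothetical.

```r
# Hypothetical design: 5 groups with 4 repeated observations each
set.seed(1)
subject <- rep(seq_len(5), each = 4)

# Toy predictors with a group-specific shift added to every column
x_toy <- matrix(rnorm(20 * 3), ncol = 3) + subject

# Group means, computed column by column and aligned with the observations
subject_means <- apply(x_toy, 2, function(column) ave(column, subject))

# Within-group variation: deviation of each observation from its group mean
x_within_toy <- x_toy - subject_means

# Each group now has a mean of zero in every column
print(round(rowsum(x_within_toy, subject) / 4, 10))
```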