lspls-package {lspls}                                        R Documentation
LS-PLS Models
Description
Implements the LS-PLS (least squares - partial least squares) method described in, for instance, Jørgensen, K., Segtnan, V. H., Thyholt, K. and Næs, T. (2004) "A Comparison of Methods for Analysing Regression Models with Both Spectral and Designed Variables", Journal of Chemometrics, 18(10), 451–464, <doi:10.1002/cem.890>.
Details
The DESCRIPTION file:
Package: lspls
Title: LS-PLS Models
Version: 0.2-2
Date: 2018-07-26
Authors@R: c(person("Bjørn-Helge", "Mevik", role = c("aut", "cre"), email = "b-h@mevik.net"))
Author: Bjørn-Helge Mevik [aut, cre]
Maintainer: Bjørn-Helge Mevik <b-h@mevik.net>
Encoding: UTF-8
Depends: pls (>= 2.2.0)
Imports: grDevices, graphics, methods, stats
Description: Implements the LS-PLS (least squares - partial least squares) method described in for instance Jørgensen, K., Segtnan, V. H., Thyholt, K., Næs, T. (2004) "A Comparison of Methods for Analysing Regression Models with Both Spectral and Designed Variables" Journal of Chemometrics, 18(10), 451--464, <doi:10.1002/cem.890>.
License: GPL-2
URL: http://mevik.net/work/software/lspls.html, https://github.com/bhmevik/lspls
BugReports: https://github.com/bhmevik/lspls/issues
Index of help topics:
MSEP.lsplsCv      MSEP, RMSEP and R^2 for LS-PLS
lspls             Fit LS-PLS Models
lspls-package     LS-PLS Models
lsplsCv           Cross-Validate LS-PLS Models
orthlspls.fit     Underlying LS-PLS Fit Function
orthlsplsCv       Low Level Cross-Validation Function
plot.lspls        Plots of LS-PLS Models
plot.lsplsCv      Plot Method for Cross-Validations
predict.lspls     Predict Method for LS-PLS Models
project           Projection and Orthogonalisation
LS-PLS (least squares–partial least squares) models are written on the form

Y = X\beta + T_1\gamma_1 + \cdots + T_k\gamma_k + E,

where each term T_i consists of one or more matrices Z_{i,1}, \ldots, Z_{i,l_i} separated by a colon (:), i.e., Z_{i,1} \colon Z_{i,2} \colon \cdots \colon Z_{i,l_i}. Multi-response models are possible, in which case Y should be a matrix.
The model is fitted from left to right. First Y is fitted to X using least squares (LS) regression and the residuals are calculated. For each i, the matrices Z_{i,1}, \ldots, Z_{i,l_i} are orthogonalised against the variables used in the regression so far (when i = 1, this means X). The residuals from the LS regression are used as the response in PLS regressions with the orthogonalised matrices as predictors (one PLS regression for each matrix), and the desired number of PLS components from each matrix is included among the LS prediction variables. The LS regression is then refit with the new variables, and new residuals are calculated.
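The following is a minimal sketch of one such step on simulated data, using only base R and the pls package. It is purely illustrative of the orthogonalise-then-PLS idea; it is not the code the package itself uses (the actual fitting is done by orthlspls.fit), and all object names and sizes are made up.

library(pls)
set.seed(1)
X <- model.matrix(~ factor(rep(1:4, each = 10)))   # designed variables (40 x 4)
Z <- matrix(rnorm(40 * 50), 40, 50)                 # one spectral block (40 x 50)
y <- drop(X %*% c(1, 2, -1, 0.5)) + rnorm(40)

lsfit  <- lm(y ~ X - 1)                  # 1. LS regression of y on X
res    <- residuals(lsfit)               # 2. LS residuals
Zorth  <- Z - X %*% qr.solve(X, Z)       # 3. orthogonalise Z against X
plsfit <- plsr(res ~ Zorth, ncomp = 3)   # 4. PLS of the residuals on the orthogonalised Z
Tz     <- scores(plsfit)[, 1:3]          # 5. keep the desired number of score vectors
refit  <- lm(y ~ X + Tz - 1)             # 6. refit LS with the new variables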
The function to fit LS-PLS models is lspls. A typical usage to fit the model

y = X\beta + Z\gamma + (V_1 \colon V_2)\eta + W\theta + E

would be
mod <- lspls(y ~ X + Z + V1:V2 + W, ncomp = list(3, c(2,1), 2), data = mydata)
The first argument is the formula describing the model. X is fit first, using LS. Then PLS scores from Z (orthogonalised) are added. Then PLS scores from V1 and V2 are added (simultaneously), and finally PLS scores from W. The next argument, ncomp, specifies the number of components to use from each PLS: 3 Z score vectors, 2 V1 score vectors, 1 V2 score vector and 2 W score vectors. Finally, mydata should be a data frame with matrices y, X, Z, V1, V2 and W (for single-response models, y can be a vector).
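Because the terms are whole matrices, such a data frame is most easily built with I(), so that each matrix is stored as a single column. This follows the convention of the pls package on which lspls depends, and is only an illustrative sketch; the object names mirror the call above:

mydata <- data.frame(y = y, X = I(X), Z = I(Z), V1 = I(V1), V2 = I(V2), W = I(W))
mod    <- lspls(y ~ X + Z + V1:V2 + W, ncomp = list(3, c(2, 1), 2), data = mydata)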
Currently, score plots and loading plots of fitted models are implemented. plot(mod, "scores") gives score plots for each PLS regression, and plot(mod, "loadings") gives loading plots. There is a predict method to predict response or score values from new data: predict(mod, newdata = mynewdata). (This predicts response values; use type = "scores" to get scores.) Also, the standard functions resid and fitted can be used to extract the residuals and fitted values.
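For instance (an illustrative sketch; mynewdata is assumed to be a data frame with the same matrix variables as mydata):

pred <- predict(mod, newdata = mynewdata)                   # predicted responses
scrs <- predict(mod, newdata = mynewdata, type = "scores")  # predicted scores
res  <- resid(mod)     # residuals of the fitted model
fit  <- fitted(mod)    # fitted values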
In order to determine the number of components to use from each matrix, one can use cross-validation:
cvmod <- lsplsCv(y ~ X + Z + V1:V2 + W, ncomp = list(4, c(3,4), 3), segments = 12, data = mydata)
In lsplsCv, ncomp gives the maximal number of components to test. The argument segments specifies the number of segments to use. One can specify the type of segments (random (default), consecutive or interleaved) with the argument segment.type. Alternatively, one can supply the segments explicitly with segments; see lsplsCv for details.
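One way to build such segments by hand is with cvsegments() from the pls package. Whether lsplsCv accepts a list of index vectors of this form is an assumption here; see lsplsCv for the authoritative description:

seg   <- cvsegments(nrow(mydata), k = 12, type = "interleaved")  # list of index vectors
cvmod <- lsplsCv(y ~ X + Z + V1:V2 + W, ncomp = list(4, c(3, 4), 3),
                 segments = seg, data = mydata)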
One can plot cross-validated RMSEP values with plot(cvmod). (Similarly, plot(cvmod, "MSEP") plots MSEP values.) This makes it easier to determine the optimal number of components for each PLS. See plot.lsplsCv for details. To calculate the RMSEP or MSEP values explicitly, one can use the functions RMSEP or MSEP.
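Put together, the cross-validation results might be inspected like this (illustrative only):

plot(cvmod)           # cross-validated RMSEP curves for each PLS term
plot(cvmod, "MSEP")   # the same plot, but showing MSEP
RMSEP(cvmod)          # cross-validated RMSEP values
MSEP(cvmod)           # cross-validated MSEP values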
Author(s)
Bjørn-Helge Mevik [aut, cre]
Maintainer: Bjørn-Helge Mevik <b-h@mevik.net>
References
Jørgensen, K., Segtnan, V. H., Thyholt, K., Næs, T. (2004) A Comparison of Methods for Analysing Regression Models with Both Spectral and Designed Variables. Journal of Chemometrics, 18(10), 451–464.
Jørgensen, K., Mevik, B.-H., Næs, T. Combining Designed Experiments with Several Blocks of Spectroscopic Data. (Submitted)
Mevik, B.-H., Jørgensen, K., Måge, I., Næs, T. LS-PLS: Combining Categorical Design Variables with Blocks of Spectroscopic Measurements. (Submitted)
See Also
lspls, lsplsCv, plot.lspls, plot.lsplsCv
Examples
## FIXME
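## The package ships no examples yet (hence the FIXME above).  The following
## is a purely illustrative sketch with simulated data; all object names,
## sizes and component counts are made up for the illustration.
library(lspls)
set.seed(42)
n <- 60
X <- model.matrix(~ factor(rep(1:3, length.out = n)))  # designed variables
Z <- matrix(rnorm(n * 80), n, 80)                       # one spectral block
y <- drop(X %*% c(1, -1, 2)) + Z[, 1] - Z[, 2] + rnorm(n)
dat <- data.frame(y = y, X = I(X), Z = I(Z))

## Cross-validate to choose the number of PLS components from Z
cv <- lsplsCv(y ~ X + Z, ncomp = list(5), segments = 10, data = dat)
plot(cv)

## Fit the final model and inspect it
mod <- lspls(y ~ X + Z, ncomp = list(2), data = dat)
plot(mod, "scores")
plot(mod, "loadings")
head(predict(mod, newdata = dat))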