plsFit {mvdalab} | R Documentation |
Partial Least Squares Regression
Description
Functions to perform partial least squares regression with a formula interface. Bootstrapping can be used. Prediction, residuals, model extraction, plot, print and summary methods are also implemented.
Usage
plsFit(formula, data, subset, ncomp = NULL, na.action,
method = c("bidiagpls", "wrtpls"), scale = TRUE, n_cores = 2,
alpha = 0.05, perms = 2000, validation = c("none", "oob", "loo"),
boots = 1000, model = TRUE, parallel = FALSE,
x = FALSE, y = FALSE, ...)
## S3 method for class 'mvdareg'
summary(object, ncomp = object$ncomp, digits = 3, ...)
Arguments
formula |
a model formula (see below). |
data |
an optional data frame containing the variables in the model. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
ncomp |
the number of components to include in the model (see below). |
na.action |
a function which indicates what should happen when the data contain NAs. |
method |
the multivariate regression algorithm to be used. |
scale |
should scaling to unit variance be used. |
n_cores |
Number of cores to run for parallel processing. Currently set to 2 with the max being 4. |
alpha |
the significance level for wrtpls. |
perms |
the number of permutations to run for wrtpls. |
validation |
character. What kind of (internal) validation to use. See below. |
boots |
Number of bootstrap samples when validation = "oob". |
model |
a logical. If TRUE, the model frame is returned. |
parallel |
should parallelization be used. |
x |
a logical. If TRUE, the model matrix is returned. |
y |
a logical. If TRUE, the response is returned. |
object |
an object of class mvdareg. |
digits |
the number of decimal places to output with summary.mvdareg. |
... |
additional arguments passed to the underlying fit functions. |
Details
The function fits a partial least squares (PLS) model with 1, ..., ncomp latent variables. Multi-response models are not supported.
The type of model to fit is specified with the method argument. Currently two PLS algorithms are available: the bidiag2 algorithm ("bidiagpls" and "wrtpls").
The formula argument should be a symbolic formula of the form response ~ terms, where response is the name of the response vector and terms is the name of one or more predictor matrices, usually separated by +, e.g., y ~ X + Z. See lm for a detailed description. The named variables should exist in the data frame supplied as data or in the global environment. The chapter Statistical models in R of the manual An Introduction to R distributed with R is a good reference on formulas in R.
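To make the formula interface concrete, the following base R sketch shows how a formula such as y ~ . is resolved against a data frame into a response vector and a predictor matrix. The data frame toy and its column names are made up for illustration, and the steps use standard model.frame machinery; how plsFit handles the formula internally may differ in detail.

```r
# Hypothetical toy data: one response (y) and three predictors.
set.seed(1)
toy <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))

# `y ~ .` means "y modelled on every other column in `data`".
mf <- model.frame(y ~ ., data = toy)

resp <- model.response(mf)                 # response vector
X <- model.matrix(attr(mf, "terms"), mf)   # design matrix, including intercept
X <- X[, colnames(X) != "(Intercept)", drop = FALSE]  # keep predictors only

length(resp)  # number of observations
dim(X)        # observations x predictors
```

Writing the predictors out explicitly, as in y ~ x1 + x2 + x3, gives the same predictor matrix for this data frame.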
The number of components to fit is specified with the argument ncomp. If this is not supplied, the maximal number of components is used.
Note that if the number of samples is <= 15, oob validation may fail. In that case it is recommended that you fit the PLS model with validation = "loo".
If method = "bidiagpls" and validation = "oob", bootstrap cross-validation is performed. Bootstrap confidence intervals are provided for coefficients, weights, loadings, and y.loadings. The number of bootstrap samples is specified with the argument boots. See mvdaboot for details.
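As a rough illustration of the out-of-bag idea behind validation = "oob", the base R sketch below draws bootstrap samples, fits a model on each in-bag sample, and scores it on the observations that were not drawn. It uses lm on simulated data as a stand-in model; it is not the mvdalab implementation, and the names dat, in_bag, oob and oob_mse are assumptions made for this example.

```r
set.seed(42)
n <- 30                                   # hypothetical number of observations
dat <- data.frame(x = rnorm(n))
dat$y <- 2 * dat$x + rnorm(n, sd = 0.5)   # simulated response

boots <- 200                              # number of bootstrap samples
oob_mse <- rep(NA_real_, boots)

for (b in seq_len(boots)) {
  in_bag <- sample(n, replace = TRUE)     # rows drawn with replacement
  oob <- setdiff(seq_len(n), in_bag)      # rows never drawn in this sample
  if (length(oob) == 0) next              # nothing left to validate on
  fit <- lm(y ~ x, data = dat[in_bag, ])  # stand-in model, not PLS
  pred <- predict(fit, newdata = dat[oob, , drop = FALSE])
  oob_mse[b] <- mean((dat$y[oob] - pred)^2)
}

mean(oob_mse, na.rm = TRUE)               # average out-of-bag prediction error
```

With very few observations, the out-of-bag set in a given bootstrap sample can become tiny, which is one plausible reason small data sets are better served by leave-one-out validation.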
If method = "bidiagpls" and validation = "loo", leave-one-out cross-validation is performed.
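The leave-one-out scheme can be sketched in a few lines of base R: each observation is held out in turn, the model is refitted on the rest, and the held-out value is predicted. The example uses lm on simulated data as a stand-in for the PLS fit, so it shows the resampling pattern only; dat, loo_pred, press and rmsecv are names chosen for this illustration.

```r
set.seed(7)
n <- 12                                    # hypothetical small sample
dat <- data.frame(x = rnorm(n))
dat$y <- 1 + 3 * dat$x + rnorm(n, sd = 0.3)

loo_pred <- numeric(n)
for (i in seq_len(n)) {
  fit <- lm(y ~ x, data = dat[-i, ])       # fit without observation i
  loo_pred[i] <- predict(fit, newdata = dat[i, , drop = FALSE])
}

press <- sum((dat$y - loo_pred)^2)         # predicted residual sum of squares
rmsecv <- sqrt(press / n)                  # root mean squared error of CV
rmsecv
```

Every observation is used for validation exactly once, so the procedure needs n model fits and involves no random resampling.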
If method = "bidiagpls" and validation = "none", no cross-validation is performed. Note that the number of components, ncomp, is set to min(nobj - 1, npred).
If method = "wrtpls" and validation = "none", the weight randomization test for the selection of the number of components is performed. Note that the number of components, ncomp, is set to min(nobj - 1, npred).
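The rule min(nobj - 1, npred) can be worked through for two data shapes. The sketch below reads nobj as the number of observations and npred as the number of predictors, which is an interpretation of the names rather than something stated above; max_components is a hypothetical helper and not part of mvdalab.

```r
# Hypothetical helper: the largest number of components implied by
# the rule min(nobj - 1, npred).
max_components <- function(X) {
  nobj <- nrow(X)    # assumed meaning: number of observations
  npred <- ncol(X)   # assumed meaning: number of predictors
  min(nobj - 1, npred)
}

wide <- matrix(0, nrow = 8, ncol = 20)   # fewer observations than predictors
tall <- matrix(0, nrow = 50, ncol = 5)   # more observations than predictors

max_components(wide)  # 7, limited by the number of observations
max_components(tall)  # 5, limited by the number of predictors
```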
Value
An object of class mvdareg is returned. The object contains all components returned by the underlying fit function. In addition, it contains the following:
loadings |
X loadings |
weights |
weights |
D2.values |
bidiag2 matrix |
iD2 |
inverse of bidiag2 matrix |
Ymean |
mean of response variable |
Xmeans |
mean of predictor variables |
coefficients |
PLS regression coefficients |
y.loadings |
y-loadings |
scores |
X scores |
R |
orthogonal weights |
Y.values |
scaled response values |
Yactual |
actual response values |
fitted |
fitted values |
residuals |
residuals |
Xdata |
X matrix |
iPreds |
predicted values |
y.loadings2 |
scaled y-loadings |
ncomp |
number of latent variables |
method |
PLS algorithm used |
scale |
scaling used |
validation |
validation method |
call |
model call |
terms |
model terms |
model |
fitted model |
Author(s)
Nelson Lee Afanador (nelson.afanador@mvdalab.com), Thanh Tran (thanh.tran@mvdalab.com)
References
NOTE: This function is adapted from mvr in package pls with extensive modifications by Nelson Lee Afanador and Thanh Tran.
See Also
bidiagpls.fit, mvdaboot, boot.plots, R2s, PE, ap.plot, T2, Xresids, smc, scoresplot, ScoreContrib, sr, loadingsplot, weightsplot, coefsplot, coefficientsplot2D, loadingsplot2D, weightsplot2D, bca.cis, coefficients.boots, loadings.boots, weight.boots, coefficients, loadings, weights, BiPlot, jk.after.boot
Examples
### PLS MODEL FIT WITH method = 'bidiagpls' and validation = 'oob', i.e. bootstrapping ###
data(Penta)
## Number of bootstraps set to 300 to demonstrate flexibility
## Use a minimum of 1000 (default) for results that support bootstrapping
mod1 <- plsFit(log.RAI ~., scale = TRUE, data = Penta[, -1], method = "bidiagpls",
ncomp = 2, validation = "oob", boots = 300)
summary(mod1) #Model summary
### PLS MODEL FIT WITH method = 'bidiagpls' and validation = 'loo', i.e. leave-one-out CV ###
## Not run:
mod2 <- plsFit(log.RAI ~., scale = TRUE, data = Penta[, -1], method = "bidiagpls",
ncomp = 2, validation = "loo")
summary(mod2) #Model summary
## End(Not run)
### PLS MODEL FIT WITH method = 'bidiagpls' and validation = 'none', i.e. no CV is performed ###
## Not run:
mod3 <- plsFit(log.RAI ~., scale = TRUE, data = Penta[, -1], method = "bidiagpls",
ncomp = 2, validation = "none")
summary(mod3) #Model summary
## End(Not run)
### PLS MODEL FIT WITH method = 'wrtpls' and validation = 'none', i.e. WRT-PLS is performed ###
## Not run:
mod4 <- plsFit(log.RAI ~., scale = TRUE, data = Penta[, -1],
method = "wrtpls", validation = "none")
summary(mod4) #Model summary
plot.wrtpls(mod4)
## End(Not run)