randtest {mdatools} | R Documentation |
Randomization test for PLS regression
Description
randtest is used to carry out a randomization/permutation test for a PLS regression model.
Usage
randtest(
x,
y,
ncomp = 15,
center = TRUE,
scale = FALSE,
nperm = 1000,
sig.level = 0.05,
silent = TRUE,
exclcols = NULL,
exclrows = NULL
)
Arguments
x |
matrix with predictors. |
y |
vector or one-column matrix with response. |
ncomp |
maximum number of components to test. |
center |
logical, whether to center predictors and response values. |
scale |
logical, whether to scale (standardize) predictors and response values. |
nperm |
number of permutations. |
sig.level |
significance level. |
silent |
logical, whether to show test progress or not. |
exclcols |
columns of x to be excluded from calculations (numbers, names or vector with logical values) |
exclrows |
rows to be excluded from calculations (numbers, names or vector with logical values) |
Details
The class implements a method for selecting the optimal number of components in PLS1 regression
based on the randomization test [1]. The basic idea is that for each component from 1 to ncomp
a statistic T, which is the covariance between the t-score (X score, derived from a PLS
model) and the reference Y values, is calculated. By repeating this for randomly permuted
Y-values, a distribution of the statistic is obtained. A parameter alpha is computed to
show how often the statistic T, calculated for permuted Y-values, is the same as or higher than
the same statistic calculated for the original data without permutations.
If a component is important, then the covariance for unpermuted data should be larger than the
covariance for permuted data, and therefore the value of alpha will be quite small (there
is still a small chance of getting a similar covariance). This makes alpha very similar to
a p-value in a statistical test.
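The idea described above can be illustrated with a minimal base R sketch. It is not the mdatools implementation: it uses simulated data, covers only the first PLS1 component (later components would require deflation of X), and the helper name first_comp_stat is hypothetical. The statistic is taken as the covariance between the first X-score and the centered response, and alpha as the share of permutations with a statistic that is the same or higher.

```r
# Illustrative sketch only: first PLS1 component, simulated data.
# This is NOT the code used inside mdatools::randtest.
set.seed(42)

n <- 50   # number of observations (assumed for the illustration)
p <- 10   # number of predictors (assumed for the illustration)
X <- matrix(rnorm(n * p), n, p)
y <- as.numeric(X[, 1:3] %*% c(1, -0.5, 0.25) + rnorm(n, sd = 0.5))

# Center predictors and response (corresponds to center = TRUE)
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)

# Hypothetical helper: statistic T for the first component, i.e. the
# covariance between the X-score t = Xc %*% w and the response,
# where w is the unit-length PLS1 weight vector
first_comp_stat <- function(Xc, yc) {
  w <- crossprod(Xc, yc)        # p x 1 vector proportional to cov(X, y)
  w <- w / sqrt(sum(w^2))       # normalize to unit length
  t_score <- Xc %*% w           # n x 1 vector of X-scores
  as.numeric(cov(t_score, yc))
}

nperm <- 500

# Statistic for the original (unpermuted) response
stat_orig <- first_comp_stat(Xc, yc)

# Distribution of the statistic for randomly permuted response values
stat_perm <- replicate(nperm, first_comp_stat(Xc, sample(yc)))

# alpha: how often the permuted statistic is the same or higher
alpha <- mean(stat_perm >= stat_orig)
print(alpha)
```

Because the simulated response depends strongly on the predictors, the statistic for the original data should lie far in the upper tail of the permutation distribution, so alpha is expected to be small.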
The randtest procedure calculates alpha for each component; the values can be inspected
using the summary or plot functions. There are also several functions that show, e.g.,
the distribution of the statistic and the critical value for each component.
Value
Returns an object of class randtest with the following fields:
nperm |
number of permutations used for the test. |
stat |
statistic values calculated for each component. |
alpha |
alpha values calculated for each component. |
statperm |
matrix with statistic values for each permutation. |
corrperm |
matrix with correlation between predicted and reference y-values for each permutation. |
ncomp.selected |
suggested number of components. |
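A short sketch of how the fields listed above might be inspected. It assumes the returned object is a list-like S3 object whose fields can be accessed with $ and names(), which this page does not state explicitly. The code runs the test only if the mdatools package is installed and uses a small nperm to keep it fast.

```r
# Names of the fields listed in the Value section
fields <- c("nperm", "stat", "alpha", "statperm", "corrperm", "ncomp.selected")

# Run only if mdatools is installed; otherwise nothing is computed
if (requireNamespace("mdatools", quietly = TRUE)) {
  data(simdata, package = "mdatools")
  x <- simdata$spectra.c
  y <- simdata$conc.c[, 3]

  # Small nperm for speed; use a larger value in practice
  r <- mdatools::randtest(x, y, ncomp = 4, nperm = 50)

  # Assumption: fields are accessible as named list elements
  print(fields %in% names(r))   # which documented fields are present
  print(r$alpha)                # alpha value for each component
  print(r$ncomp.selected)       # suggested number of components
}
```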
References
1. S. Wiklund et al. Journal of Chemometrics 21 (2007) 427-439.
See Also
Methods for randtest objects:
print.randtest | prints information about a randtest object. |
summary.randtest | shows summary statistics for the test. |
plot.randtest | shows bar plot for alpha values. |
plotHist.randtest | shows distribution of statistic plot. |
plotCorr.randtest | shows determination coefficient plot. |
Examples
### Examples of using the test
## Get the spectral data from Simdata set and apply SNV transformation
data(simdata)
y = simdata$conc.c[, 3]
x = simdata$spectra.c
x = prep.snv(x)
## Run the test and show summary
## (normally use higher nperm values > 1000)
r = randtest(x, y, ncomp = 4, nperm = 200, silent = FALSE)
summary(r)
## Show plots
par(mfrow = c(3, 2))
plot(r)
plotHist(r, ncomp = 3)
plotHist(r, ncomp = 4)
plotCorr(r, 3)
plotCorr(r, 4)
par(mfrow = c(1, 1))