randtest {mdatools} R Documentation

Randomization test for PLS regression

Description

randtest carries out a randomization (permutation) test for a PLS regression model.

Usage

randtest(
  x,
  y,
  ncomp = 15,
  center = TRUE,
  scale = FALSE,
  nperm = 1000,
  sig.level = 0.05,
  silent = TRUE,
  exclcols = NULL,
  exclrows = NULL
)

Arguments

x

matrix with predictors.

y

vector or one-column matrix with response.

ncomp

maximum number of components to test.

center

logical, whether to center predictors and response values.

scale

logical, whether to scale (standardize) predictors and response values.

nperm

number of permutations.

sig.level

significance level.

silent

logical, controls whether test progress is shown.

exclcols

columns of x to be excluded from calculations (numbers, names or a vector with logical values).

exclrows

rows to be excluded from calculations (numbers, names or a vector with logical values).

Details

The class implements a method for selecting the optimal number of components in PLS1 regression based on the randomization test [1]. The basic idea is that for each component from 1 to ncomp a statistic T, the covariance between the t-score (X score, derived from a PLS model) and the reference Y-values, is calculated. Repeating this for randomly permuted Y-values gives a distribution of the statistic. A parameter alpha is then computed to show how often the statistic T calculated for permuted Y-values is the same as or higher than the statistic calculated for the original, unpermuted data.

If a component is important, the covariance for unpermuted data should be larger than the covariance for permuted data, and therefore the value of alpha will be quite small (there is still a small chance of getting a similar covariance). This makes alpha very similar to a p-value in a statistical test.

The randtest procedure calculates alpha for each component; the values can be inspected using the summary or plot functions. There are also several functions for showing, e.g., the distribution of the statistic and the critical value for each component.
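
The procedure described above can be sketched in base R for the first component only. This is a minimal illustration under stated assumptions, not the mdatools implementation: the data are synthetic, the helper first_comp_stat is hypothetical, the first PLS1 weight vector is taken proportional to X'y, and the weight is recomputed for every permuted y.

```r
# Minimal base-R sketch of a permutation test for the first PLS1 component.
# Everything here is illustrative; it does not call mdatools.
set.seed(42)

# Synthetic data (assumption): 50 objects, 10 predictors,
# y depends on the first two predictors plus noise
n <- 50
p <- 10
X <- matrix(rnorm(n * p), n, p)
y <- X[, 1] - 0.5 * X[, 2] + rnorm(n, sd = 0.5)

# Hypothetical helper: statistic T for the first component,
# i.e. the covariance between the t-score and the response
first_comp_stat <- function(X, y) {
  Xc <- scale(X, center = TRUE, scale = FALSE)  # mean-center predictors
  yc <- y - mean(y)                             # mean-center response
  w <- crossprod(Xc, yc)                        # weight vector, proportional to X'y
  w <- w / sqrt(sum(w^2))                       # normalize to unit length
  t_score <- Xc %*% w                           # X score (t-score)
  drop(cov(t_score, yc))
}

# Statistic for the original (unpermuted) response
stat_orig <- first_comp_stat(X, y)

# Distribution of the statistic for randomly permuted responses
nperm <- 200
stat_perm <- replicate(nperm, first_comp_stat(X, sample(y)))

# alpha: share of permutations with a statistic at least as large as the original
alpha <- mean(stat_perm >= stat_orig)
```

Because the simulated y really depends on X, alpha is expected to be close to zero here; for a response unrelated to X it would typically be much larger.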

Value

Returns an object of randtest class with the following fields:

nperm

number of permutations used for the test.

stat

statistic values calculated for each component.

alpha

alpha values calculated for each component.

statperm

matrix with statistic values for each permutation.

corrperm

matrix with correlation between predicted and reference y-values for each permutation.

ncomp.selected

suggested number of components.
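
One way to read alpha values against sig.level is sketched below. The rule used here (keep components up to the first one whose alpha exceeds the significance level) is an assumption made for illustration and may differ from how randtest computes ncomp.selected; the alpha values and the names sig_level and ncomp_selected are hypothetical.

```r
# Hypothetical alpha values for four components (illustrative, not real output)
alpha <- c(0.000, 0.002, 0.030, 0.410)
sig_level <- 0.05

# Assumed rule: stop at the first component that is not significant
not_significant <- which(alpha > sig_level)
ncomp_selected <- if (length(not_significant) == 0) {
  length(alpha)            # all tested components are significant
} else {
  not_significant[1] - 1   # components before the first non-significant one
}
```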

References

[1] S. Wiklund et al. Journal of Chemometrics 21 (2007) 427-439.

See Also

Methods for randtest objects:

print.randtest prints information about a randtest object.
summary.randtest shows summary statistics for the test.
plot.randtest shows a bar plot of alpha values.
plotHist.randtest shows a plot of the statistic distribution.
plotCorr.randtest shows a determination coefficient plot.

Examples

### Examples of using the test

## Get the spectral data from Simdata set and apply SNV transformation

data(simdata)

y = simdata$conc.c[, 3]
x = simdata$spectra.c
x = prep.snv(x)

## Run the test and show summary
## (normally use a higher nperm value, > 1000)
r = randtest(x, y, ncomp = 4, nperm = 200, silent = FALSE)
summary(r)

## Show plots

par(mfrow = c(3, 2))
plot(r)
plotHist(r, ncomp = 3)
plotHist(r, ncomp = 4)
plotCorr(r, 3)
plotCorr(r, 4)
par(mfrow = c(1, 1))


[Package mdatools version 0.14.1 Index]