simca {mdatools}    R Documentation

SIMCA one-class classification

Description

simca builds a SIMCA (Soft Independent Modelling of Class Analogies) model for one-class classification.

Usage

simca(
  x,
  classname,
  ncomp = min(nrow(x) - 1, ncol(x) - 1, 20),
  x.test = NULL,
  c.test = NULL,
  cv = NULL,
  ...
)

Arguments

x

a numerical matrix with data values.

classname

short text (up to 20 symbols) with class name.

ncomp

maximum number of components to calculate.

x.test

a numerical matrix with test data.

c.test

a vector with classes of test data objects (can be text with names of classes or logical).

cv

cross-validation settings (see details).

...

any other parameters suitable for pca method.

Details

SIMCA is in essence a PCA model with additional functionality, so the simca class inherits most of the functionality of the pca class. It makes the classification decision using critical limits computed for the Q and T2 residuals of the PCA model.
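To make the two distances concrete, the following sketch computes T2 and Q for each calibration object using only base R (prcomp). It is an illustration under assumptions, not the mdatools implementation: the variable names are hypothetical, and it does not reproduce how simca derives the critical limits.

```r
# Sketch only: base R, not the code used inside simca()
x = as.matrix(iris[1:20, 1:4])   # 20 setosa objects as calibration set
ncomp = 1                        # number of components kept in the model

# PCA on mean-centred data
p = prcomp(x, center = TRUE, scale. = FALSE)
loadings = p$rotation[, 1:ncomp, drop = FALSE]   # nvar x ncomp
scores = p$x[, 1:ncomp, drop = FALSE]            # nobj x ncomp
eigenvals = p$sdev[1:ncomp]^2

# T2: squared scores scaled by the eigenvalues, summed over components
T2 = rowSums(sweep(scores^2, 2, eigenvals, "/"))

# Q: sum of squared residuals left after projecting onto the components
xc = sweep(x, 2, p$center, "-")
E = xc - scores %*% t(loadings)
Q = rowSums(E^2)

head(cbind(T2, Q))
```

An object is accepted as a class member when its distances stay within the limits; the limits themselves are taken from the fitted model (T2lim and Qlim) rather than computed here.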

Cross-validation settings, cv, can be a number or a list. If cv is a number, it is used as the number of segments for random cross-validation (if cv = 1, full cross-validation is performed). If it is a list, the following syntax can be used: cv = list('rand', nseg, nrep) for random repeated cross-validation with nseg segments and nrep repetitions, or cv = list('ven', nseg) for systematic splits into nseg segments ('venetian blinds').
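The variants described above could be written as follows. This is a minimal sketch: the segment and repetition counts (4) and the object names are arbitrary choices for illustration.

```r
library(mdatools)

# calibration set: first 20 setosa objects
se = iris[1:20, 1:4]

# full cross-validation
m_full = simca(se, "setosa", cv = 1)

# random cross-validation with 4 segments
m_rand = simca(se, "setosa", cv = 4)

# random cross-validation with 4 segments, repeated 4 times
m_rep = simca(se, "setosa", cv = list("rand", 4, 4))

# systematic splits ("venetian blinds") into 4 segments
m_ven = simca(se, "setosa", cv = list("ven", 4))
```

Random splits differ from run to run, so calling set.seed() beforehand is a reasonable precaution when results need to be reproducible.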

Value

Returns an object of simca class with the following fields:

classname

a short text with class name.

calres

an object of class simcares with classification results for the calibration data.

testres

an object of class simcares with classification results for the test data, if provided.

cvres

an object of class simcares with classification results for cross-validation, if this option was chosen.

Fields, inherited from pca class:

ncomp

number of components included in the model.

ncomp.selected

selected (optimal) number of components.

loadings

matrix with loading values (nvar x ncomp).

eigenvals

vector with eigenvalues for all calculated components.

expvar

vector with explained variance for each component (in percent).

cumexpvar

vector with cumulative explained variance for each component (in percent).

T2lim

statistical limit for T2 distance.

Qlim

statistical limit for Q residuals.

info

information about the model, provided by the user when building the model.
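As a quick way to see these fields, the sketch below fits a model with a test set and inspects a few of them. The split of iris into calibration and test rows and the object names are assumptions made for illustration only.

```r
library(mdatools)

x = iris[, 1:4]
cl = as.character(iris[, 5])   # reference class names as text

# calibration: first 20 setosa objects; test: all remaining objects
idx = 1:20
model = simca(x[idx, ], "setosa", x.test = x[-idx, ], c.test = cl[-idx])

# fields specific to simca
model$classname
class(model$calres)
class(model$testres)

# fields inherited from pca
model$ncomp.selected
dim(model$loadings)   # nvar x ncomp
model$expvar
```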

Author(s)

Sergey Kucheryavskiy (svkucheryavski@gmail.com)

References

S. Wold, M. Sjostrom. "SIMCA: A method for analyzing chemical data in terms of similarity and analogy" in B.R. Kowalski (ed.), Chemometrics Theory and Application, American Chemical Society Symposium Series 52, Wash., D.C., American Chemical Society, p. 243-282.

See Also

Methods for simca objects:

print.simca shows information about the object.
summary.simca shows summary statistics for the model.
plot.simca makes an overview of the SIMCA model with four plots.
predict.simca applies the SIMCA model to new data.

Methods, inherited from classmodel class:

plotPredictions.classmodel shows plot with predicted values.
plotSensitivity.classmodel shows sensitivity plot.
plotSpecificity.classmodel shows specificity plot.
plotMisclassified.classmodel shows misclassified ratio plot.

Methods, inherited from pca class:

selectCompNum.pca sets the optimal number of components in the model.
plotScores.pca shows scores plot.
plotLoadings.pca shows loadings plot.
plotVariance.pca shows explained variance plot.
plotCumVariance.pca shows cumulative explained variance plot.
plotResiduals.pca shows Q vs. T2 residuals plot.

Examples

## make a SIMCA model for Iris setosa class with full cross-validation
library(mdatools)

data = iris[, 1:4]
class = iris[, 5]

# take first 20 objects of setosa as calibration set
se = data[1:20, ]

# make SIMCA model with full cross-validation and select one component
model = simca(se, "setosa", cv = 1)
model = selectCompNum(model, 1)

# show information, summary and plot overview
print(model)
summary(model)
plot(model)

# show predictions
par(mfrow = c(2, 1))
plotPredictions(model, show.labels = TRUE)
plotPredictions(model, res = "cal", ncomp = 2, show.labels = TRUE)
par(mfrow = c(1, 1))

# show sensitivity, misclassification ratio, loadings, and residuals for ncomp = 2
par(mfrow = c(2, 2))
plotSensitivity(model)
plotMisclassified(model)
plotLoadings(model, comp = c(1, 2), show.labels = TRUE)
plotResiduals(model, ncomp = 2)
par(mfrow = c(1, 1))


[Package mdatools version 0.14.1 Index]