| feemjackknife {albatross} | R Documentation |
Jack-knife outlier detection in PARAFAC models
Description
Perform leave-one-out fitting and validation of PARAFAC models on a given FEEM cube.
Usage
feemjackknife(cube, ..., progress = TRUE)
## S3 method for class 'feemjackknife'
plot(
x, kind = c('estimations', 'RIP', 'IMP'), ...
)
## S3 method for class 'feemjackknife'
coef(
object, kind = c('estimations', 'RIP', 'IMP'), ...
)
Arguments
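cube |
A feemcube object. |
progress |
Set to FALSE to disable progress reporting. |
x, object |
An object returned by feemjackknife. |
kind |
Chooses what to plot (when called as plot) or what to return as a data.frame (when called as coef): 'estimations', 'RIP' (resample influence plot) or 'IMP' (identity match plot). |
... |
For feemjackknife, passed to feemparafac. For plot.feemjackknife, passed to xyplot. |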
Details
The function takes each sample out of the dataset, fits a PARAFAC model without it, then fits the omitted sample to that model with the emission and excitation factors fixed:
\hat{\mathbf{c}} =
(\mathbf{A} \ast \mathbf{B})^{+} \times \mathrm{vec}(\mathbf{X})
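As a rough illustration of this step, the sketch below recovers the scores of one sample by least squares against fixed loadings in base R. It is not the package's code: the toy matrices, the orientation (emission in rows of A, excitation in rows of B) and the column-major definition of vec are all assumptions made for the example, and the order of the factors in the column-wise Kronecker product depends on that vec convention.

```r
# Hypothetical toy loadings: 5 emission and 4 excitation wavelengths, 2 components
set.seed(1)
A <- matrix(runif(5 * 2), 5, 2)   # emission loadings (assumed orientation)
B <- matrix(runif(4 * 2), 4, 2)   # excitation loadings (assumed orientation)
c_true <- c(2, 0.5)               # scores of the omitted sample

# Noise-free spectrum of that sample: X = A diag(c) B^T
X <- A %*% diag(c_true) %*% t(B)

# Column k of Z is vec(a_k b_k^T), so that vec(X) = Z c
Z <- sapply(seq_len(ncol(A)), function(k) as.vector(outer(A[, k], B[, k])))

# Least-squares solution, equivalent to multiplying vec(X) by the
# pseudoinverse of Z when Z has full column rank
c_hat <- qr.solve(Z, as.vector(X))
```

With noise-free data the estimated scores `c_hat` reproduce `c_true` up to rounding error; with real spectra they are the least-squares fit.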
The individual leave-one-out models (fitted loadings \mathbf A, \mathbf B and scores \mathbf C) are reordered according to the best match by Tucker's congruence coefficient and rescaled by minimising
\| \mathbf A \, \mathrm{diag}(\mathbf s_\mathrm A) - \mathbf A^\mathrm{orig} \|^2
and
\| \mathbf B \, \mathrm{diag}(\mathbf s_\mathrm B) - \mathbf B^\mathrm{orig} \|^2
over \mathbf s_\mathrm A and \mathbf s_\mathrm B, subject to
\mathrm{diag}(\mathbf s_\mathrm A) \times \mathrm{diag}(\mathbf s_\mathrm B) \times \mathrm{diag}(\mathbf s_\mathrm C) = \mathbf I
to make them comparable.
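The reordering and rescaling can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the matrices are made up, the matching is a simple per-component argmax on the product of the emission and excitation congruences (which need not give a one-to-one assignment in general), and the constraint is used only to derive the score scales from the two loading scales.

```r
# Tucker's congruence coefficient between all pairs of columns of P and Q
congruence <- function(P, Q)
  crossprod(P, Q) / sqrt(outer(colSums(P^2), colSums(Q^2)))

# Hypothetical reference model and a permuted, rescaled copy of it
set.seed(1)
A_orig <- matrix(runif(10), 5, 2)
B_orig <- matrix(runif(8), 4, 2)
C_orig <- matrix(runif(6), 3, 2)
perm <- c(2, 1)
A <- A_orig[, perm] %*% diag(c(2, 4))
B <- B_orig[, perm] %*% diag(c(0.5, 0.1))
C <- C_orig[, perm] %*% diag(c(1, 2.5))  # scale products: 2*0.5*1 = 4*0.1*2.5 = 1

# For each reference component, pick the most congruent fitted component
match_idx <- apply(congruence(A_orig, A) * congruence(B_orig, B), 1, which.max)
A <- A[, match_idx]; B <- B[, match_idx]; C <- C[, match_idx]

# Per-column least-squares scale: s_k = <a_k, a_k^orig> / <a_k, a_k>
s_A <- colSums(A * A_orig) / colSums(A^2)
s_B <- colSums(B * B_orig) / colSums(B^2)
s_C <- 1 / (s_A * s_B)  # keeps diag(s_A) diag(s_B) diag(s_C) = I

A_scaled <- sweep(A, 2, s_A, `*`)
B_scaled <- sweep(B, 2, s_B, `*`)
C_scaled <- sweep(C, 2, s_C, `*`)
```

Because the scale factors multiply to one for every component, the rescaled loadings and scores describe the same fitted model as before.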
Once the models are fitted, resample influence plots and identity match plots can be produced from the resulting data to detect outliers.
To conserve memory, feemjackknife puts the user-provided
cube in an environment and passes it via envir and
subset options of feemparafac. This means that,
unlike in feemparafac, the cube argument has
to be a feemcube object and passing envir and
subset options to feemjackknife is not supported. It
is recommended to fully name the parameters to be passed to
feemparafac to avoid problems.
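One way such problems can arise follows from R's general argument matching rules: formal arguments that come after ... in a signature are matched only by their full name, so an abbreviated name is forwarded through ... instead. The sketch below uses hypothetical stand-in functions (inner_fit and wrapper are not part of the package) that merely mimic the shape of the signature shown in Usage.

```r
# Hypothetical stand-in for the function that receives the forwarded arguments
inner_fit <- function(x, nfac, ctol = 1e-6) list(nfac = nfac, ctol = ctol)

# Hypothetical wrapper with the same shape as feemjackknife(cube, ..., progress = TRUE)
wrapper <- function(cube, ..., progress = TRUE)
  list(progress = progress, fit = inner_fit(cube, ...))

# Fully named arguments end up where they are intended
ok <- wrapper(1, nfac = 3, ctol = 1e-4, progress = FALSE)

# An abbreviated 'prog' is not matched to 'progress'; it is forwarded
# through ... and rejected by inner_fit as an unused argument
bad <- try(wrapper(1, 3, prog = FALSE), silent = TRUE)
```

Here `ok$progress` is FALSE as intended, while `bad` holds an error because the abbreviated argument never reached `progress`.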
plot.feemjackknife provides sane defaults for
xyplot parameters xlab, ylab,
scales, as.table, but they can be overridden.
Value
- feemjackknife
-
A list of class feemjackknife containing the following entries:
- overall
-
Result of fitting the overall cube with feemparafac.
- leaveone
-
A list of length dim(cube)[3] containing the reduced dataset components. Every feemparafac object in the list has an additional Chat attribute containing the result of fitting the excluded spectrum back to the loadings of the reduced model.
- plot.feemjackknife
-
A lattice plot object. Its print or plot method will draw the plot on an appropriate plotting device.
-
A
data.framecontaining various columns, depending on the value of thekindargument:- estimations
-
- loading
Values of the loadings.
- mode
-
The axis of the loadings, “Emission” or “Excitation”.
- wavelength
-
Emission or excitation wavelength the loading values correspond to.
- factor
The component number.
- omitted
-
The sample (name if cube had names, integer if it didn't) that was omitted to get the resulting loading values.
- RIP
-
- msq.resid
-
Mean squared residual value for the model with a given sample omitted.
- Emission
-
Mean squared difference in emission mode loadings between the overall model and the model with a given sample omitted.
- Excitation
-
Mean squared difference in excitation mode loadings between the overall model and the model with a given sample omitted.
- omitted
-
The sample (name if cube had names, integer if it didn't) that was omitted from a given model.
- IMP
-
- score.overall
Score values for the overall model.
- score.predicted
-
Score values estimated from the loadings of the model missing a given sample.
- factor
The component number.
- omitted
-
The sample (name if cube had names, integer if it didn't) that was omitted from a given model.
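To show how the RIP columns described above might be used, the sketch below screens a data frame laid out like the documented kind = 'RIP' result. The numbers are invented, and the flagging rule (a shift more than 10 times the median) is only an illustrative assumption, not a criterion taken from the package or from the reference.

```r
# Hypothetical values arranged in the documented 'RIP' columns
rip <- data.frame(
  msq.resid  = c(1.02, 0.98, 1.00, 0.61, 1.01),
  Emission   = c(2e-6, 1e-6, 3e-6, 4e-4, 2e-6),
  Excitation = c(1e-6, 2e-6, 1e-6, 9e-4, 1e-6),
  omitted    = c("s1", "s2", "s3", "s4", "s5"),
  stringsAsFactors = FALSE
)

# Illustrative rule: flag a sample when omitting it shifts the loadings of
# either mode by more than 10 times the median shift for that mode
shift_ratio <- pmax(rip$Emission / median(rip$Emission),
                    rip$Excitation / median(rip$Excitation))
suspects <- rip$omitted[shift_ratio > 10]
```

A sample flagged this way is only a candidate outlier; the plots produced by plot.feemjackknife remain the primary tool for the judgement.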
References
Riu J, Bro R (2003). “Jack-knife technique for outlier detection and estimation of standard errors in PARAFAC models.” Chemometrics and Intelligent Laboratory Systems, 65(1), 35-49. doi:10.1016/S0169-7439(02)00090-4.
See Also
Examples
data(feems)
cube <- feemscale(feemscatter(cube, rep(14, 4)), na.rm = TRUE)
# takes a long time; the stopping criterion is weakened for speed
jk <- feemjackknife(cube, nfac = 3, ctol = 1e-4)
# feemparafac methods should be able to use the environment and subset
plot(jk$leaveone[[1]])
plot(jk)
plot(jk, 'IMP')
plot(jk, 'RIP')
head(coef(jk))