feemcorcondia {albatross} | R Documentation |
Core consistency diagnostic for PARAFAC models
Description
Compute the core consistency diagnostic (“CORCONDIA”) by fitting a “Tucker3” core array to the existing PARAFAC loadings.
Usage
feemcorcondia(
model, divisor = c("nfac", "core"),
kind = c('pinv', 'iterative', 'vec'), ...
)
## S3 method for class 'feemcorcondia'
print(x, ...)
Arguments
model |
A PARAFAC model returned by |
divisor |
The divisor used in computation of the CORCONDIA value, see Details. |
kind |
A string choosing the method used to compute the least squares Tucker3 core. Defaults to “pinv” for PARAFAC models without missing data and “iterative” for models where missing data is present. See Details. |
x |
An object returned by |
... |
|
Details
The “Tucker3” model uses three loading matrices and a small three-way “core array” to describe a larger three-way array:
X_{i,j,k} = \sum_r \sum_s \sum_t A_{i,r} B_{j,s} C_{k,t} G_{r,s,t}
It's easy to show that constraining
G_{r,s,t} = 1_{r = s = t}
makes the Tucker3 model equivalent to a PARAFAC model. The core
consistency diagnostic works by constraining the loading matrices of a
Tucker3 model to the existing loading matrices from a PARAFAC model
and estimating the core array. The closer the resulting
\mathbf{G}
tensor is to a diagonal one, the
better.
Given the least-squares estimated core tensor
\mathbf{G}
, the ideal core tensor
T_{r,s,t} = 1_{r = s = t}
and the denominator
D
, the CORCONDIA metric is defined as follows:
\left(
1 - \frac{
\sum_r \sum_s \sum_t (G_{r,s,t} - T_{r,s,t})^2
}{D}
\right) \cdot 100\%
The denominator can be chosen to be either
\sum_r \sum_s \sum_t T_{r,s,t}^2
, which is
equal to the number of factors in the model (divisor = 'nfac'
),
or
\sum_r \sum_s \sum_t G_{r,s,t}^2
, which will
avoid resulting negative values (divisor = 'core'
).
There are multiple ways how the least squares Tucker3 core can be
computed. When no data is missing, the matricised form of the model
can be used to derive the expression (kind = 'pinv'
, the
default in such cases):
\mathbf{X} =
\mathbf{A} \mathbf{G} (\mathbf{C} \otimes \mathbf{B})^\top
+ \epsilon
\hat{\mathbf{G}} =
\mathbf{A}^{+} \mathbf{X} (
(\mathbf{C}^\top)^{+} \otimes
(\mathbf{B}^\top)^{+}
)
With missing data present, a binary matrix of weights
\mathbf{W}
appears:
\min_\mathbf{G} \left\| \mathbf W \circ (
\mathbf{A} \mathbf{G} (\mathbf{C} \otimes \mathbf{B})^\top
- \mathbf{X}
) \right\|^2
A gradient-based method can be used to solve this problem iteratively
without allocating too much memory, but care must be taken to ensure
convergence. For kind = 'iterative'
(which is the default for
models with missing data), optim
is used with
parameters method = 'L-BFGS-B'
,
control = list(maxit = 1000, factr = 1)
. Warnings will be
produced if the method doesn't indicate successful convergence.
The problem can also be solved exactly by unfolding the tensor into a vector and skipping the elements marked as missing:
\min_\mathbf{G} \left\|
\mbox{vec}(
\mathbf{A} \mathbf{G} (\mathbf{C} \otimes \mathbf{B})^\top
)_{\mbox{non-missing}}
- \mbox{vec}(\mathbf{X})_{\mbox{non-missing}}
\right\|^2
\mbox{vec}(\mathbf{C} \otimes \mathbf{B} \otimes \mathbf{A})
_{\mbox{non-missing}}
\times \mbox{vec}\:\mathbf{G}
= \mbox{vec}(\mathbf{X})_{\mbox{non-missing}}
Unfortunately, when this method is used (kind = 'vec'
), the
left-hand side of the equation has the size of
\mathbf{X}
times the number of components cubed,
which grows very quickly for models with large numbers of components.
Value
A numeric scalar of class feemcorcondia
with the following
attributes:
divisor |
The |
core |
A three-way array containing the least-squares Tucker core for the given PARAFAC model. |
References
Bro R, Kiers HAL (2003). “A new efficient method for determining the number of components in PARAFAC models.” Journal of Chemometrics, 17(5), 274-286. ISSN 0886-9383, doi:10.1002/cem.801.
See Also
multiway::corcondia
Examples
data(feems)
cube <- feemscale(feemscatter(cube, c(20, 14)), na.rm = TRUE)
# kind = 'vec' is exact but may take a lot of memory
feemcorcondia(feemparafac(cube, nfac = 3, ctol = 1e-4), kind = 'vec')
# kind = 'iterative' used by default for models with missing data
feemcorcondia(feemparafac(cube, nfac = 4, ctol = 1e-4))