SMI {MatrixCorrelation} | R Documentation |
Similarity of Matrices Index (SMI)
Description
A similarity index for comparing coupled data matrices.
Usage
SMI(
X1,
X2,
ncomp1 = Rank(X1) - 1,
ncomp2 = Rank(X2) - 1,
projection = "Orthogonal",
Scores1 = NULL,
Scores2 = NULL,
impute = FALSE,
impute_par = list(max_iter = 20, tol = 10^-5)
)
Arguments
X1 |
first matrix |
X2 |
second matrix |
ncomp1 |
maximum number of subspace components from the first matrix |
ncomp2 |
maximum number of subspace components from the second matrix |
projection |
type of projection to apply, defaults to "Orthogonal", alternatively "Procrustes". |
Scores1 |
user supplied score-matrix for X1 (optional, defaults to NULL) |
Scores2 |
user supplied score-matrix for X2 (optional, defaults to NULL) |
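impute |
logical; whether to impute missing values in X1 and X2, defaults to FALSE |
impute_par |
named list of imputation parameters (max_iter, tol) |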
Details
SMI is computed in two steps. First, stable subspaces are extracted from each matrix using Principal Component Analysis or some other method, yielding two orthonormal bases. Second, these bases are compared using Orthogonal Projection (OP / ordinary least squares) or Procrustes Rotation (PR). The result is a similarity measure that can be adjusted to various data sets and contexts, and which supports explorative plotting and permutation-based testing of matrix subspace equality.
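The two steps can be illustrated with a minimal base R sketch of the Orthogonal Projection variant. This is not the package's implementation: the helper name smi_op_sketch is hypothetical, the bases are assumed to be the leading left singular vectors of the column-centred matrices, and the normalisation by the smaller number of components reflects one reading of the OP form in the cited paper. The Procrustes variant is not shown.

```r
# Hypothetical helper (not part of MatrixCorrelation): an OP-style
# similarity between the leading PCA subspaces of two matrices.
smi_op_sketch <- function(X1, X2, p, q) {
  # Step 1: orthonormal bases from the left singular vectors of the
  # column-centred matrices (assumption: plain PCA via svd)
  T1 <- svd(scale(X1, scale = FALSE))$u[, seq_len(p), drop = FALSE]
  U2 <- svd(scale(X2, scale = FALSE))$u[, seq_len(q), drop = FALSE]
  # Step 2: squared Frobenius norm of T1'U2, divided by the smaller
  # subspace dimension so that the value lies between 0 and 1
  sum(crossprod(T1, U2)^2) / min(p, q)
}

set.seed(1)
X <- matrix(rnorm(50 * 10), 50, 10)
Y <- matrix(rnorm(50 * 10), 50, 10)

same      <- smi_op_sketch(X, X, 3, 3)  # identical subspaces
nested    <- smi_op_sketch(X, X, 2, 4)  # 2-dim subspace inside a 4-dim one
unrelated <- smi_op_sketch(X, Y, 3, 3)  # independent random matrices
```

Under these assumptions, identical or nested subspaces give a value of 1 up to rounding, while unrelated matrices give a value between 0 and 1.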
Value
A matrix containing the SMI for all combinations of components. The object has class "SMI", with associated print, plot and summary methods.
Author(s)
Kristian Hovde Liland
References
Ulf Geir Indahl, Tormod Næs, Kristian Hovde Liland; 2018. A similarity index for comparing coupled matrices. Journal of Chemometrics; e3049.
See Also
plot.SMI (print.SMI/summary.SMI), RV (RV2/RVadj), r1 (r2/r3/r4/GCD), Rozeboom, Coxhead, allCorrelations (matrix correlation comparison), PCAcv (cross-validated PCA), PCAimpute (PCA based imputation).
Examples
library(MatrixCorrelation)

# Simulation
X1 <- scale( matrix( rnorm(100*300), 100,300), scale = FALSE)
usv <- svd(X1)
X2 <- usv$u[,-3] %*% diag(usv$d[-3]) %*% t(usv$v[,-3])
(smi <- SMI(X1,X2,5,5))
plot(smi, B = 1000 ) # default B = 10000
# Sensory analysis
data(candy)
plot(SMI(candy$Panel1, candy$Panel2, 3, 3, projection = "Procrustes"),
     frame = c(2,2), B = 1000, x1lab = "Panel1", x2lab = "Panel2") # default B = 10000
# Missing data (100 values missing completely at random in each matrix)
# sample() draws distinct positions, so exactly 100 entries are set to NA
X1[sample(length(X1), 100)] <- NA
X2[sample(length(X2), 100)] <- NA
(smi <- SMI(X1,X2,5,5, impute = TRUE))