R: Similarity of Matrices Index (SMI)

SMI {MatrixCorrelation}

R Documentation

Description

A similarity index for comparing coupled data matrices.

Usage

SMI(
  X1,
  X2,
  ncomp1 = Rank(X1) - 1,
  ncomp2 = Rank(X2) - 1,
  projection = "Orthogonal",
  Scores1 = NULL,
  Scores2 = NULL,
  impute = FALSE,
  impute_par = list(max_iter = 20, tol = 10^-5)
)

Arguments

`X1`	first `matrix` to be compared (a `data.frame` is also accepted).
`X2`	second `matrix` to be compared (a `data.frame` is also accepted).
`ncomp1`	maximum number of subspace components from the first `matrix`.
`ncomp2`	maximum number of subspace components from the second `matrix`.
`projection`	type of projection to apply: "Orthogonal" (default) or "Procrustes".
`Scores1`	user supplied score-`matrix` to replace singular value decomposition of first `matrix`.
`Scores2`	user supplied score-`matrix` to replace singular value decomposition of second `matrix`.
`impute`	`logical`; if `TRUE`, PCA-based imputation of missing values in `X1`/`X2` is activated (default `FALSE`).
`impute_par`	named `list` of imputation parameters (`max_iter` and `tol`, see Usage), applied when `X1`/`X2` contain `NA`s.
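
The `impute` and `impute_par` arguments suggest an iterative, PCA-based fill-in of missing values. The following is a minimal sketch of one common scheme of that kind, not the package's internal routine, which may differ. The function name `pca_impute_sketch`, the mean-based starting values, the relative-change stopping rule and the reading of `max_iter` and `tol` as an iteration cap and a convergence tolerance are all assumptions made for illustration.

```r
# Hypothetical helper (not part of MatrixCorrelation): iterative PCA imputation.
pca_impute_sketch <- function(X, ncomp, max_iter = 20, tol = 1e-5) {
  miss <- is.na(X)
  # Assumed starting point: fill each NA with the mean of its column.
  col_means <- colMeans(X, na.rm = TRUE)
  X[miss] <- col_means[col(X)[miss]]
  k <- seq_len(ncomp)
  for (iter in seq_len(max_iter)) {
    mu  <- colMeans(X)
    usv <- svd(sweep(X, 2, mu))
    # Rank-ncomp reconstruction with the column means added back.
    low_rank <- usv$u[, k, drop = FALSE] %*% (usv$d[k] * t(usv$v[, k, drop = FALSE]))
    fit <- sweep(low_rank, 2, mu, "+")
    # Assumed stopping rule: relative squared change in the imputed entries.
    change <- sum((X[miss] - fit[miss])^2) / max(sum(X[miss]^2), .Machine$double.eps)
    X[miss] <- fit[miss]   # only the missing cells are ever overwritten
    if (change < tol) break
  }
  X
}

set.seed(1)
X    <- matrix(rnorm(50 * 8), 50, 8)
X_na <- X
X_na[sample(length(X_na), 20)] <- NA
X_imp <- pca_impute_sketch(X_na, ncomp = 2)
anyNA(X_imp)  # FALSE
```

Observed entries are left untouched; only the cells that were `NA` are replaced by the low-rank fit.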

Details

SMI is computed in two steps. First, a stable subspace is extracted from each matrix using Principal Component Analysis or some other method, yielding two orthonormal bases. Second, these bases are compared using Orthogonal Projection (OP, i.e. ordinary least squares) or Procrustes Rotation (PR). The result is a similarity measure that can be adjusted to various data sets and contexts, and that supports explorative plotting and permutation-based testing of matrix subspace equality.
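
The two steps can be sketched in base R for a single pair of subspace dimensions. This is an illustration under stated assumptions, not the package's implementation: the helper `smi_op_sketch` is hypothetical, and the orthogonal-projection index is assumed to be the squared Frobenius norm of the cross-product of the two bases divided by the smaller subspace dimension, which is how the formula in the reference is understood here. The Procrustes variant is not shown.

```r
# Hypothetical helper (not part of MatrixCorrelation): OP-type similarity
# for p components of X1 and q components of X2.
smi_op_sketch <- function(X1, X2, p, q) {
  # Step 1: centre each matrix and keep the leading left singular vectors
  # as an orthonormal basis for its dominant subspace.
  T1 <- svd(scale(X1, scale = FALSE))$u[, seq_len(p), drop = FALSE]
  T2 <- svd(scale(X2, scale = FALSE))$u[, seq_len(q), drop = FALSE]
  # Step 2 (assumed formula): squared Frobenius norm of T1'T2,
  # normalised by the smaller of the two subspace dimensions.
  sum(crossprod(T1, T2)^2) / min(p, q)
}

set.seed(1)
A <- matrix(rnorm(20 * 6), 20, 6)
B <- matrix(rnorm(20 * 6), 20, 6)
smi_op_sketch(A, A, 3, 3)  # identical subspaces: 1 up to rounding
smi_op_sketch(A, B, 3, 3)  # unrelated matrices: a value between 0 and 1
```

Under this assumed formula the value is 1 when one subspace is contained in the other and 0 when the two subspaces are orthogonal.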

Value

A matrix of class "SMI" containing the similarity values for all combinations of components. Print, plot and summary methods are available for this class.

Author(s)

Kristian Hovde Liland

References

Ulf Geir Indahl, Tormod Næs, Kristian Hovde Liland; 2018. A similarity index for comparing coupled matrices. Journal of Chemometrics; e3049.

Examples

# Simulation: X2 is X1 reconstructed without its third singular component
X1  <- scale( matrix( rnorm(100*300), 100,300), scale = FALSE)
usv <- svd(X1)
X2  <- usv$u[,-3] %*% diag(usv$d[-3]) %*% t(usv$v[,-3])

(smi <- SMI(X1,X2,5,5))
plot(smi, B = 1000 ) # default B = 10000

# Sensory analysis
data(candy)
plot( SMI(candy$Panel1, candy$Panel2, 3,3, projection = "Procrustes"),
    frame = c(2,2), B = 1000, x1lab = "Panel1", x2lab = "Panel2" ) # default B = 10000

# Missing data (100 points missing completely at random in each matrix)
# sample() draws distinct positions, so exactly 100 entries are set to NA
X1[sample(length(X1), 100)] <- NA
X2[sample(length(X2), 100)] <- NA
(smi <- SMI(X1,X2,5,5, impute = TRUE))

[Package MatrixCorrelation version 0.10.0]