rm.matrix.validation {RMThreshold} | R Documentation |
Validate input matrix prior to threshold computation
Description
The function checks if the input matrix is well-conditioned for the algorithm used by RMThreshold. The matrix must be real-valued, symmetric, and large. Rank and sparseness of the matrix are calculated. Diagnostic plots are created.
Usage
rm.matrix.validation(rand.mat, unfold.method = "gaussian",
bandwidth = "nrd0", nr.fit.points = 51, discard.outliers = TRUE)
Arguments
rand.mat |
A random, real-valued, symmetric input matrix. |
unfold.method |
A string variable that determines which type of unfolding algorithm is used. Must be one of 'gaussian' (Gaussian kernel density) or 'spline' (cubic spline interpolation on the cumulative distribution function). |
bandwidth |
Bandwidth used to calculate the Gaussian kernel density. Only active if |
nr.fit.points |
Number of evenly spaced supporting points used for the cubic spline to the empirical cumulative distribution function. |
discard.outliers |
A logical variable that determines if outliers are to be discarded from the spectrum of eigenvalues. |
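As a rough illustration of what the 'spline' option describes, the following base-R sketch fits a cubic spline through nr.fit.points evenly spaced supporting points of the empirical cumulative distribution function and uses it to unfold the eigenvalues. It is a sketch under stated assumptions, not the code used inside RMThreshold: the helper name unfold.spline.sketch and the test matrix are hypothetical, and the package may place the supporting points or scale the result differently.

```r
## Hypothetical helper, not part of RMThreshold: unfold eigenvalues with a
## cubic spline fitted to the empirical cumulative distribution function.
unfold.spline.sketch <- function(mat, nr.fit.points = 51) {
  ev <- sort(eigen(mat, symmetric = TRUE, only.values = TRUE)$values)
  cdf <- ecdf(ev)                          # empirical CDF of the eigenvalues
  support <- seq(min(ev), max(ev), length.out = nr.fit.points)
  fit <- splinefun(support, cdf(support))  # cubic spline through the support points
  unfolded <- length(ev) * fit(ev)         # map eigenvalues through the smoothed CDF
  list(unfolded = unfolded, spacings = diff(unfolded))
}

## Assumed test input: a symmetrized Gaussian random matrix
set.seed(1)
m <- matrix(rnorm(200 * 200), 200, 200)
m <- (m + t(m)) / 2
u <- unfold.spline.sketch(m)
mean(u$spacings)                           # close to 1 after unfolding
```

After unfolding, the mean spacing is close to one, which is what makes the spacing histogram comparable to a reference distribution.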
Details
The input matrix must be real-valued and symmetric (a correlation or mutual information matrix satisfies both by construction). The matrix must not be too sparse (if it is, thresholding is probably unnecessary). The rank of the matrix must not be too low, so that a sufficient number of non-zero eigenvalues is obtained. Furthermore, the matrix must not be too small, because Random Matrix Theory applies only to large (theoretically infinite) matrices. The function creates a diagnostic plot showing the empirical eigenvalue distribution and the distribution of the spacings between the eigenvalues. The eigenvalue distribution of the input matrix should approximately resemble the Wigner semi-circle, while the spacings should resemble the Wigner-Dyson distribution (Wigner surmise).
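The requirements listed above can also be checked by hand before calling the function. The following is a minimal base-R sketch; the helper name precheck.sketch is hypothetical, and the definitions it uses (sparseness as the fraction of zero entries, rank taken from a QR decomposition) are assumptions that may differ from what RMThreshold computes internally.

```r
## Hypothetical helper, not part of RMThreshold: basic checks on an input matrix.
precheck.sketch <- function(mat) {
  list(
    is.real      = is.numeric(mat),             # real-valued entries
    is.symmetric = isSymmetric(unname(mat)),    # symmetric up to numerical tolerance
    size         = nrow(mat),                   # number of rows
    sparseness   = mean(mat == 0),              # assumed definition: fraction of zeros
    rank         = qr(mat)$rank                 # numerical rank via QR decomposition
  )
}

## Assumed test input: a symmetrized Gaussian random matrix
set.seed(1)
m <- matrix(rnorm(100 * 100), 100, 100)
m <- (m + t(m)) / 2
chk <- precheck.sketch(m)
str(chk)
```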
Value
A list containing the following entries:
sparseness |
The sparseness of the input matrix. |
rank |
The rank of the input matrix. |
validation.plot |
The name of the validation plot. |
unfold.plot |
The name of the plot which can be used to check if eigenvalue unfolding worked correctly. |
nr.outliers.removed |
The number of eigenvalue outliers that have been removed. Only if |
Note
It is highly recommended to check the input matrix using this function.
Author(s)
Uwe Menzel <uwemenzel@gmail.com>
References
https://en.wikipedia.org/wiki/Random_matrix
Wigner, E. P., Characteristic vectors of bordered matrices with infinite dimensions, Ann. Math. 62, 548-564, 1955.
Mehta, M., Random Matrices, 3rd edition. Academic Press, 2004.
Furht, B. and Escalante, A. (eds.), Handbook of Data Intensive Computing, Springer Science and Business Media, 2011.
See Also
Creating a random matrix: create.rand.mat
Examples
## Run with self-created random matrix:
set.seed(777)
random.matrix <- create.rand.mat(size = 1000, distrib = "norm")$rand.matr
dim(random.matrix) # 1000 1000 should be big enough
## Not run:
res <- rm.matrix.validation(random.matrix)
res <- rm.matrix.validation(random.matrix, discard.outliers = FALSE)
res <- rm.matrix.validation(random.matrix, unfold.method = "spline")
res <- rm.matrix.validation(random.matrix, unfold.method = "spline", discard.outliers = FALSE)
## End(Not run)
## Not run:
library(igraph)
library(Matrix) # provides bdiag(), used below
## Create noisy matrix and validate:
g <- erdos.renyi.game(1000, 0.1)
adj <- as.matrix(get.adjacency(g))
rm.matrix.validation(adj) # Wigner-Dyson case, unstructured matrix, noise
## Create modular (block-diagonal) matrix and validate:
matlist <- list()
for (i in 1:4) matlist[[i]] <- get.adjacency(erdos.renyi.game(250, 0.1))
mat <- bdiag(matlist) # block-diagonal matrix
rm.matrix.validation(as.matrix(mat)) # Exponential case, modular matrix
## End(Not run)
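The comments in the examples above distinguish a "Wigner-Dyson case" from an "Exponential case". As a reference for reading the spacing histogram, the two curves can be written down in base R. This is a sketch assuming the standard form of the Wigner surmise for real symmetric matrices and a unit-mean exponential for uncorrelated spacings; the function names wigner.surmise and exponential.spacing are hypothetical and not part of RMThreshold.

```r
## Hypothetical helpers, not part of RMThreshold: reference spacing densities
## for unfolded eigenvalues (mean spacing normalized to 1).
wigner.surmise      <- function(s) (pi / 2) * s * exp(-pi * s^2 / 4)
exponential.spacing <- function(s) exp(-s)

## Both are probability densities on [0, Inf)
area.wigner      <- integrate(wigner.surmise, 0, Inf)$value       # approximately 1
area.exponential <- integrate(exponential.spacing, 0, Inf)$value  # approximately 1

## The Wigner surmise vanishes at s = 0 (level repulsion),
## whereas the exponential density is maximal there.
curve(wigner.surmise, from = 0, to = 4, xlab = "spacing s", ylab = "density")
curve(exponential.spacing, from = 0, to = 4, add = TRUE, lty = 2)
```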