dimRed {qlcMatrix}    R Documentation

Dimensionality Reduction for sparse matrices, based on Cholesky decomposition

Description

To inspect the structure of a large sparse matrix, it is often useful to reduce the matrix to a few major dimensions (cf. multidimensional scaling). This function implements a rough approach to providing such dimensions. It is a simple wrapper around Cholesky and sparsesvd.

Usage

dimRed(sim, k = 2, method = "svd")

Arguments

sim

Sparse, symmetric, positive-definite matrix (typically a similarity matrix produced by the sim or assoc functions).

k

Number of dimensions to be returned, defaults to two.

method

Method used for the decomposition. Currently implemented are svd and cholesky.

Details

Based on the Cholesky decomposition, the matrix sim is decomposed into:

L D L'

D is a diagonal matrix, the values of which are returned here as $D. Only the first few columns of the L matrix are returned (possibly after permutation, see the details at Cholesky).
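
The L D L' factorization described above can be illustrated on a small dense matrix. This is only a sketch in base R: the matrix A and its values are made up for illustration, and chol() is used here instead of the sparse Cholesky routine that dimRed wraps, so no permutation is involved.

```r
# Hypothetical stand-in for `sim`: a small symmetric, positive-definite matrix
A <- matrix(c(4, 2, 2,
              2, 3, 1,
              2, 1, 3), nrow = 3, byrow = TRUE)

R  <- chol(A)                     # upper triangular factor, A = t(R) %*% R
Lc <- t(R)                        # lower triangular Cholesky factor
d  <- diag(Lc)^2                  # diagonal values of D
L  <- Lc %*% diag(1 / diag(Lc))   # rescale columns so that L has a unit diagonal

# check the reconstruction A = L D L'
all.equal(A, L %*% diag(d) %*% t(L))

# keep only the first k columns of L, mirroring the truncation described above
k <- 2
L_k <- L[, 1:k, drop = FALSE]
```

Truncating L to k columns discards information, so L_k together with the first k values of d only approximates A.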

Based on the singular value decomposition (svd), the matrix sim is decomposed into:

U D V'

The first k columns of the U matrix are returned as $L, and the corresponding values from D as $D.
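
The singular value decomposition can be sketched in the same way. The example below uses base R's dense svd() on a made-up matrix A, not the sparsesvd routine that dimRed wraps, so it shows the idea rather than the actual implementation.

```r
# Hypothetical stand-in for `sim`: a small symmetric, positive-definite matrix
A <- matrix(c(4, 2, 2,
              2, 3, 1,
              2, 1, 3), nrow = 3, byrow = TRUE)

s <- svd(A)   # list with components u, d (singular values) and v

# check the reconstruction A = U D V'
all.equal(A, s$u %*% diag(s$d) %*% t(s$v))

# for a symmetric, positive-definite matrix the singular values
# coincide with the eigenvalues
all.equal(s$d, eigen(A, symmetric = TRUE)$values)

# keep the first k columns of U and the first k values of D
k <- 2
U_k <- s$u[, 1:k, drop = FALSE]
d_k <- s$d[1:k]
```

Because the singular values come back in decreasing order, the first k columns of U correspond to the k largest values in D.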

Value

A list of two elements is returned:

L

: a sparse matrix of type dgCMatrix with k columns

D

: the diagonal values from the Cholesky decomposition, or the singular values from the svd decomposition (for a symmetric, positive-definite matrix these equal the eigenvalues)

Author(s)

Michael Cysouw <cysouw@mac.com>

See Also

See also Cholesky and sparsesvd.

Examples

# some random points in two dimensions
coor <- cbind(sample(1:30), sample(1:30))

# using cmdscale() to reconstruct the coordinates from a distance matrix
d <- dist(coor)
mds <- cmdscale(d)

# using dimRed() on a similarity matrix.
# Note that normL works much better than other norms in this 2-dimensional case
s <- cosSparse(t(coor), norm = normL)
red <- as.matrix(dimRed(s)$L)

# show the different point clouds

oldpar <- par("mfrow")
par(mfrow = c(1,3))

  plot(coor, type = "n", axes = FALSE, xlab = "", ylab = "")
  text(coor, labels = 1:30)
  title("Original coordinates")
  
  plot(mds, type = "n", axes = FALSE, xlab = "", ylab = "")
  text(mds, labels = 1:30)
  title("MDS from euclidean distances")
  
  plot(red, type = "n", axes = FALSE, xlab = "", ylab = "")
  text(red, labels = 1:30)
  title("dimRed from cosSparse similarity")

par(mfrow = oldpar)

# ======

# example, using the iris data
data(iris)
X <- t(as.matrix(iris[,1:4]))
cols <- rainbow(3)[iris$Species]

s <- cosSparse(X, norm = norm1)
d <- dist(t(X), method = "manhattan")

svd <- as.matrix(dimRed(s, method = "svd")$L)
mds <- cmdscale(d)

oldpar <- par("mfrow")
par(mfrow = c(1,2))
  plot(mds, col = cols, main = "cmdscale\nfrom manhattan distances")
  plot(svd, col = cols, main = "dimRed with svd\nfrom cosSparse with norm1")
par(mfrow = oldpar)

[Package qlcMatrix version 0.9.8 Index]