BCSpectral {biclust} R Documentation

## The Spectral Bicluster algorithm

### Description

Performs spectral biclustering as described in Kluger et al. (2003). Spectral biclustering assumes that a normalized microarray data matrix has a checkerboard structure. This structure can be discovered with a singular value decomposition (SVD), whose singular vectors (eigenvectors) are computed for both genes (rows) and conditions (columns).
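The checkerboard idea can be illustrated with base R alone. The sketch below is not the package's implementation: it builds a hypothetical toy matrix with two row groups and two column groups, applies a log transform followed by row and column centering (a simplified stand-in for the "log" normalization), and clusters the leading singular vectors with k-means. The toy data and all variable names are assumptions made for illustration.

```r
set.seed(1)

# Hypothetical toy data: 40 x 30 positive matrix with a 2 x 2 checkerboard of block means
row_group <- rep(1:2, each = 20)
col_group <- rep(1:2, each = 15)
block_means <- matrix(c(2, 8, 8, 2), nrow = 2)
A <- block_means[row_group, col_group] + matrix(runif(40 * 30, 0, 0.5), 40, 30)

# Log transform, then remove row and column effects (double centering)
L <- log(A)
K <- sweep(sweep(L, 1, rowMeans(L)), 2, colMeans(L)) + mean(L)

# For a checkerboard, the leading singular vectors are roughly piecewise constant
s <- svd(K)
row_clusters <- kmeans(s$u[, 1], centers = 2, nstart = 10)$cluster
col_clusters <- kmeans(s$v[, 1], centers = 2, nstart = 10)$cluster

# Compare the recovered clusters with the groups used to build the matrix
table(row_clusters, row_group)
table(col_clusters, col_group)
```

Plotting `s$u[, 1]` against the row index shows the step pattern directly; with real data the steps are noisier, which is why several eigenvectors are usually examined.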

### Usage

## S4 method for signature 'matrix,BCSpectral'
biclust(x, method=BCSpectral(), normalization="log", numberOfEigenvalues=6,
        minr=2, minc=2, withinVar=1, n_clusters = NULL, n_best = 3)

### Arguments

| Argument | Description |
|---|---|
| `x` | The data matrix in which biclusters are to be found. |
| `method` | Here BCSpectral, to perform the spectral algorithm. |
| `normalization` | Normalization method to apply to x. Three methods are allowed, as described by Kluger et al.: "log" (logarithmic normalization), "irrc" (independent rescaling of rows and columns) and "bistochastization". If "log" normalization is used, make sure the logarithm can be applied to the elements of the data matrix; if there are values under 1, (1+abs(min(x))) is automatically added to each element of x. Default is "log", as recommended by Kluger et al. |
| `numberOfEigenvalues` | The number of eigenvalues considered to find biclusters. Each row (gene) eigenvector is combined with all column (condition) eigenvectors for the first numberOfEigenvalues eigenvalues. Note that a high number can dramatically increase running time. Usually, only the first few eigenvectors are used. With the "irrc" and "bistochastization" methods, the first eigenvalue contains background (irrelevant) information, so it is ignored. |
| `minr` | Minimum number of rows that biclusters must have. The algorithm will not consider smaller biclusters. |
| `minc` | Minimum number of columns that biclusters must have. The algorithm will not consider smaller biclusters. |
| `withinVar` | Maximum within variation allowed. Since spectral biclustering outputs a checkerboard structure regardless of the relevance of individual cells, this within-variation threshold is used to keep only the relevant cells. |
| `n_clusters` | Vector whose first element is the number of row clusters and whose second element is the number of column clusters. If n_clusters = NULL, the number of clusters is estimated. |
| `n_best` | Number of eigenvectors onto which the data are projected for the final clustering step. Recommended values are 2 or 3. |
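The automatic shift described for "log" normalization can be sketched in a few lines of base R. This is only an illustration of the documented rule, not the package's code; the helper name `shift_for_log` and the small input matrix are hypothetical.

```r
# Hypothetical helper mirroring the documented rule:
# if any value is under 1, add 1 + abs(min(mat)) to every element
shift_for_log <- function(mat) {
  if (min(mat) < 1) {
    mat <- mat + 1 + abs(min(mat))
  }
  mat
}

m <- matrix(c(-3, 0.5, 2, 10), nrow = 2)
shifted <- shift_for_log(m)

shifted        # every element is now at least 1
log(shifted)   # so the logarithm is defined and non-negative
```

Because the shift changes the scale of the data, it may be preferable to transform the matrix deliberately before calling the function rather than rely on the automatic adjustment.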

### Value

Returns an object of class Biclust.

### Author(s)

Sami Leon Sami_Leon@URMC.Rochester.edu

Rodrigo Santamaria rodri@usal.es

### References

Kluger et al., "Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions", Genome Research, 2003, vol. 13, pages 703-716

### Examples

library(biclust)

# Random matrix with embedded bicluster
test <- matrix(rnorm(5000),100,50)
test[11:20,11:20] <- rnorm(100,10,0.1)
image(test)

shuffled_test <- test[sample(nrow(test)), sample(ncol(test))]
image(shuffled_test)

# Without specifying the number of row and column clusters
res1 <- spectral(shuffled_test, normalization="log", numberOfEigenvalues=6,
                 minr=2, minc=2, withinVar=1, n_clusters = NULL, n_best = 3)
res1
image(shuffled_test[order(res1@info$row_labels), order(res1@info$column_labels)])

# Specifying the number of row and column clusters
res2 <- spectral(shuffled_test, normalization="log", numberOfEigenvalues=6,
                 minr=2, minc=2, withinVar=1, n_clusters = 2, n_best = 3)
res2
image(shuffled_test[order(res2@info$row_labels), order(res2@info$column_labels)])
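The same analysis can presumably be run through the `biclust()` generic shown under Usage. The following is a minimal sketch that assumes the biclust package is installed (it is skipped otherwise) and reuses the simulated data from the examples above; the variable names `x` and `res` are illustrative.

```r
# Runs only if the biclust package is available
if (requireNamespace("biclust", quietly = TRUE)) {
  library(biclust)
  set.seed(42)

  # Random matrix with an embedded bicluster, as in the examples above
  x <- matrix(rnorm(5000), 100, 50)
  x[11:20, 11:20] <- rnorm(100, 10, 0.1)

  # Call the generic with the BCSpectral method, following the Usage signature
  res <- biclust(x, method = BCSpectral(), normalization = "log",
                 numberOfEigenvalues = 6, minr = 2, minc = 2,
                 withinVar = 1, n_clusters = NULL, n_best = 3)

  print(res)          # summary of the Biclust object
  print(res@Number)   # number of biclusters kept after filtering
}
```

The number of biclusters reported depends on the `withinVar` threshold, so it can be zero for data without a sufficiently homogeneous block.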

[Package biclust version 2.0.3.1 Index]