Heatmap {DataVisualizations}R Documentation

Heatmap for Clustering

Description

Heatmap of Distances of Data sorted by Cls. Clustering algorithms provide a Classifcation of data, where the labels are defined as a numeric vector Cls. Then, a typical cluster-respectively group structure is displayed by the Heatmap function. At the margin of the heatmap a dendrogram can be shown, if hierarchical cluster algorithms are used [Wilkinson,2009]. Here the dendrogram has to be shown separately and only the heatmap itself is displayed

Usage

Heatmap(DataOrDistances,Cls,method='euclidean',

LowLim=0,HiLim,LineWidth=0.5,Clabel="Cluster No.")

Arguments

DataOrDistances

if not symmetric, then the function assumes a [1:n,1:d] numeric matrix of n data cases in rows amd d variables in columns. In this case, the distance metric specifed in method will be used.

Otherwise,

[1:n,1:n] distance matrix that is symmetric

Cls

[1:n] numerical vector of numbers defining the classification as the main output of the clustering algorithm. It has k unique numbers for k clusters that represent the arbitrary labels of the clustering, assuming a descending order of 1 to k. If not ordered please use ClusterRenameDescendingSize. Otherwise x and y label will be incorrect.

method

Optional, if DataOrDistances is a [1:n,1:d] not symmetric numerical matrix, please see parDist for accessible distance methods, default is Euclidean

LowLim

Optional: limits for the color axis

HiLim

Optional: limits for the color axis

LineWidth

Width of lines seperating the clusters in the heatmap

Clabel

Default "Cluster No.", for large number of clusters abbrevations can be used like "Cls No." or "C" in order to fit as the x and y axis labels

Details

"Cluster heatmaps are commonly used in biology and related fields to reveal hierarchical clusters in data matrices. Heatmaps visualize a data matrix by drawing a rectangular grid corresponding to rows and columns in the matrix and coloring the cells by their values in the data matrix. In their most basic form, heatmaps have been used for over a century [Wilkinson, 2012]. In addition to coloring cells, cluster heatmaps reorder the rows and/or columns of the matrix based on the results of hierarchical clustering. (...) . Cluster heatmaps have high data density, allowing them to compact large amounts of information into a small space [Weinstein, 2008]", [Engle, 2017].

The procedure can be adapted to distance matrices [Thrun, 2018]. Then, the color scale is chosen such that pixels of low distances have blue and teal colors, pixels of middle distances yellow colors, and pixels of high distances have orange and red colors [Thrun, 2018]. The distances are ordered by the clustering and the clusters are divided by black lines. A clustering is valid if the intra-cluster distances are distinctively smaller that inter-cluster distances in the heatmap [Thrun, 2018]. For another example, please see [Thrun, 2018] (Fig. 3.7, p. 31).

Value

object of ggplot2

Author(s)

Michael Thrun

References

[Wilkinson,2009] Wilkinson, L., & Friendly, M.: The history of the cluster heat map, The American Statistician, Vol. 63(2), pp. 179-184. 2009.

[Engle et al., 2017] Engle, S., Whalen, S., Joshi, A., & Pollard, K. S.: Unboxing cluster heatmaps, BMC bioinformatics, Vol. 18(2), pp. 63. 2017.

[Weinstein, 2008] Weinstein, J. N.: A postgenomic visual icon, Science, Vol. 319(5871), pp. 1772-1773. 2008.

[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, doi:10.1007/978-3-658-20540-9, 2018.

See Also

Pixelmatrix

Examples

data("Lsun3D")
Cls=Lsun3D$Cls
Data=Lsun3D$Data

#Data
Heatmap(Data,Cls = Cls)

#Distances
Heatmap(as.matrix(dist(Data)),Cls = Cls)



[Package DataVisualizations version 1.3.2 Index]