Heatmap {DataVisualizations} | R Documentation |
Heatmap for Clustering
Description
Heatmap of the distances of data, sorted by Cls. Clustering algorithms provide a classification of data, where the labels are defined as a numeric vector Cls. A typical cluster structure (or group structure) is then displayed by the Heatmap function.
If a hierarchical clustering algorithm is used, a dendrogram can be shown at the margin of a heatmap [Wilkinson, 2009].
Here, only the heatmap itself is displayed; the dendrogram has to be shown separately.
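As an illustration of showing the dendrogram separately, the following minimal sketch uses only base R (stats). The toy data, the average linkage, and the choice of k = 2 are assumptions made for this example and are not taken from the package.

```r
# Toy data: two well-separated groups of 10 points each (made up for illustration).
set.seed(1)
Data <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
              matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2))

# Hierarchical clustering on Euclidean distances (assumed linkage: average).
hc <- hclust(dist(Data), method = "average")

# Cut the tree into k = 2 clusters to obtain the numeric label vector Cls.
Cls <- cutree(hc, k = 2)

# The dendrogram is drawn on its own, independent of the heatmap.
if (interactive()) plot(as.dendrogram(hc))

# With DataVisualizations installed, the heatmap itself would then be:
# DataVisualizations::Heatmap(Data, Cls = Cls)
```

Any clustering that yields a numeric label vector of length n can take the place of cutree here.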
Usage
Heatmap(DataOrDistances,Cls,method='euclidean',
LowLim=0,HiLim,LineWidth=0.5,Clabel="Cluster No.")
Arguments
DataOrDistances |
If not symmetric, the function assumes a [1:n,1:d] numeric matrix of n data cases in rows and d variables in columns. In this case, the distance metric specified in method is used. Otherwise, a symmetric [1:n,1:n] distance matrix |
Cls |
[1:n] numerical vector of numbers defining the classification as the main output of the clustering algorithm. It has k unique numbers for k clusters that represent the arbitrary labels of the clustering, assuming a descending order of 1 to k. If not ordered please use |
method |
Optional,
if |
LowLim |
Optional: lower limit of the color axis |
HiLim |
Optional: upper limit of the color axis |
LineWidth |
Width of lines separating the clusters in the heatmap |
Clabel |
Default "Cluster No." |
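The way DataOrDistances is interpreted can be sketched in base R as follows. The helper as_distance_matrix is hypothetical and is not part of DataVisualizations; it only mirrors the rule described above (symmetric input is taken as distances, anything else as data) and may differ from the package's internal code.

```r
# Hypothetical helper, not part of DataVisualizations: returns an n x n
# distance matrix for either kind of input.
as_distance_matrix <- function(DataOrDistances, method = "euclidean") {
  x <- as.matrix(DataOrDistances)
  # A symmetric (hence square) matrix is taken to be a distance matrix already.
  if (isSymmetric(unname(x))) {
    return(x)
  }
  # Otherwise rows are data cases and columns are variables.
  as.matrix(dist(x, method = method))
}

# Three cases in two variables, chosen so the distances are easy to verify.
Data <- matrix(c(0, 0,
                 3, 4,
                 6, 8), ncol = 2, byrow = TRUE)

D1 <- as_distance_matrix(Data)  # computed from data, 3 x 3
D2 <- as_distance_matrix(D1)    # already symmetric, returned unchanged
```

Note that under this rule a square data matrix that happens to be symmetric would be read as distances.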
Details
"Cluster heatmaps are commonly used in biology and related fields to reveal hierarchical clusters in data matrices. Heatmaps visualize a data matrix by drawing a rectangular grid corresponding to rows and columns in the matrix and coloring the cells by their values in the data matrix. In their most basic form, heatmaps have been used for over a century [Wilkinson, 2009]. In addition to coloring cells, cluster heatmaps reorder the rows and/or columns of the matrix based on the results of hierarchical clustering. (...) Cluster heatmaps have high data density, allowing them to compact large amounts of information into a small space [Weinstein, 2008]" [Engle et al., 2017].
The procedure can be adapted to distance matrices [Thrun, 2018]. The color scale is then chosen such that pixels of low distances have blue and teal colors, pixels of middle distances have yellow colors, and pixels of high distances have orange and red colors [Thrun, 2018]. The distances are ordered by the clustering and the clusters are divided by black lines. A clustering is valid if the intra-cluster distances are distinctly smaller than the inter-cluster distances in the heatmap [Thrun, 2018]. For another example, please see [Thrun, 2018] (Fig. 3.7, p. 31).
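The ordering and the validity criterion described above can be sketched numerically in base R. The toy data and all variable names are assumptions for illustration, and comparing mean distances is only a rough numeric stand-in for the visual judgment made on the heatmap.

```r
# Toy data: two well-separated groups, then shuffled so rows are not grouped.
set.seed(2)
Data <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
              matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2))
Cls <- rep(c(1, 2), each = 10)
perm <- sample(nrow(Data))
Data <- Data[perm, ]
Cls <- Cls[perm]

# Pairwise distances, reordered so that cases of the same cluster are adjacent.
D <- as.matrix(dist(Data))
ord <- order(Cls)
DSorted <- D[ord, ord]
ClsSorted <- Cls[ord]

# Split the off-diagonal distances into intra-cluster and inter-cluster ones.
same <- outer(ClsSorted, ClsSorted, "==")
offDiag <- row(DSorted) != col(DSorted)
intra <- mean(DSorted[same & offDiag])
inter <- mean(DSorted[!same])

# Row/column indices after which a separating line would be drawn.
breaks <- cumsum(table(ClsSorted))
```

For a valid clustering in the sense above, intra should come out clearly smaller than inter, which corresponds to blue blocks along the diagonal of the heatmap and red blocks elsewhere.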
Value
A ggplot2 object.
Author(s)
Michael Thrun
References
[Wilkinson, 2009] Wilkinson, L., & Friendly, M.: The history of the cluster heat map, The American Statistician, Vol. 63(2), pp. 179-184. 2009.
[Engle et al., 2017] Engle, S., Whalen, S., Joshi, A., & Pollard, K. S.: Unboxing cluster heatmaps, BMC Bioinformatics, Vol. 18(2), p. 63. 2017.
[Weinstein, 2008] Weinstein, J. N.: A postgenomic visual icon, Science, Vol. 319(5871), pp. 1772-1773. 2008.
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, doi:10.1007/978-3-658-20540-9. 2018.
See Also
Examples
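library(DataVisualizations)
# Load the example data set
data("Lsun3D")
Cls=Lsun3D$Cls
Data=Lsun3D$Data
# Data matrix as input: distances are computed with the default method
Heatmap(Data,Cls = Cls)
# Precomputed symmetric distance matrix as input
Heatmap(as.matrix(dist(Data)),Cls = Cls)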