Hybrid {PHclust} | R Documentation |
Calculate optimal number of clusters.
Description
This function estimates the optimal number of clusters for a given dataset.
Usage
Hybrid(data, absolute = FALSE, Kstart = NULL, Treatment)
Arguments
data |
Data matrix of dimension N*P, where N is the number of features and P is the number of samples. |
absolute |
Logical. Whether to use absolute (TRUE) or relative (FALSE) feature abundance to determine clusters. |
Kstart |
Positive integer. The number of clusters at which the hybrid merging algorithm starts. Should be large enough that Kstart exceeds the optimal number of clusters. If NULL (the default), max(50, sqrt(N)) is used. |
Treatment |
Vector of length P, giving the treatment group of each sample; repeated values mark replicates of the same group. For example, Treatment = c(1,1,2,2,3,3) indicates 3 treatment groups, each with 2 replicates. |
Value
A positive integer indicating the optimal number of clusters.
Examples
######## Run the following code in order:
##
## This is a sample data set which has 100 features, and 4 treatment groups with 4 replicates each.
data('sample_data')
head(sample_data)
set.seed(1)
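##
## Optional sanity check (a minimal sketch, not part of the package's own examples):
## Treatment needs one entry per sample, so its length should equal the number of
## columns of the data matrix. 'trt' is a name introduced here for illustration only.
trt <- rep(c(1,2,3,4), each = 4)
length(trt) == ncol(sample_data)   # should be TRUE if the data are laid out as described above
table(trt)                         # number of replicates in each treatment group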
##
## Finding the optimal number of clusters
K <- Hybrid(sample_data, Kstart = 4, Treatment = rep(c(1,2,3,4), each = 4))
##
## Clustering result from EM algorithm
result <- PHcluster(sample_data, rep(c(1,2,3,4), each = 4), K, method = 'EM', nstart = 1)
print(result$cluster)
##
## Plot the feature abundance level for each cluster
plot_abundance(result, sample_data, Treatment = rep(c(1,2,3,4), each = 4))
[Package PHclust version 0.1.0 Index]