fhclustd {dad} | R Documentation |
Hierarchic cluster analysis of probability densities
Description
Performs functional hierarchic cluster analysis of probability densities. It returns an object of class fhclustd. It applies hclust to the distance matrix between the densities.
Usage
fhclustd(xf, group.name = "group", gaussiand = TRUE,
         distance = c("jeffreys", "hellinger", "wasserstein", "l2", "l2norm"),
         windowh = NULL, data.centered = FALSE, data.scaled = FALSE,
         common.variance = FALSE, sub.title = "", filename = NULL,
         method.hclust = "complete")
Arguments
xf |
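object of class folder (for example, as returned by as.folder), or a data frame whose grouping column is named by group.name.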
|
group.name |
string. Name of the grouping variable (default: "group").
|
gaussiand |
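logical. If TRUE (default), the densities are estimated parametrically, as Gaussian densities. If FALSE, they are estimated using the Gaussian kernel method. |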
distance |
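The distance or divergence used to compute the distance matrix between the densities. It can be "jeffreys" (default), "hellinger", "wasserstein", "l2" or "l2norm". If it is "hellinger", "jeffreys" or "wasserstein", the densities are considered Gaussian and are necessarily parametrically estimated. |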
windowh |
either a list of bandwidths, a single numerical value, or NULL (default); see Details. Omitted when gaussiand = TRUE. |
data.centered |
logical. Default: FALSE. |
data.scaled |
logical. Default: FALSE. |
common.variance |
logical. Default: FALSE. |
sub.title |
string. If provided, the subtitle for the graphs. |
filename |
string. Name of the file in which the results are saved. Default: NULL. |
method.hclust |
the agglomeration method to be used for the clustering (default: "complete"). See the method argument of hclust. |
Details
To compute the distances or dissimilarities between the groups, the probability densities corresponding to the groups of individuals are either estimated parametrically (gaussiand = TRUE) or estimated using the Gaussian kernel method (gaussiand = FALSE). In the latter case, the windowh argument provides the list of bandwidths to be used. Note that in the multivariate case (dimension > 1), the bandwidths are positive-definite matrices.
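The two estimation modes can be illustrated in the univariate case with base R only. This is a minimal sketch, not the package's internal code: the simulated data, the names f_param and f_kernel, and the use of bw.nrd0 as the bandwidth are all assumptions made for illustration.

```r
set.seed(1)
x <- rnorm(50, mean = 2, sd = 1.5)   # simulated observations for one group

# Parametric estimate: Gaussian density with the sample mean and standard deviation
m <- mean(x)
s <- sd(x)
f_param <- function(t) dnorm(t, mean = m, sd = s)

# Gaussian kernel estimate: average of Gaussian bumps centred at the observations
h <- bw.nrd0(x)   # rule-of-thumb bandwidth from stats (an assumption, not the package's choice)
f_kernel <- function(t) {
  vapply(t, function(u) mean(dnorm(u, mean = x, sd = h)), numeric(1))
}

# Both estimates are densities, so they should integrate to approximately 1
lower <- min(x) - 10 * s
upper <- max(x) + 10 * s
area_param  <- integrate(f_param,  lower, upper)$value
area_kernel <- integrate(f_kernel, lower, upper)$value
print(c(parametric = area_param, kernel = area_kernel))
```

The parametric estimate is fully described by two numbers (mean and standard deviation), whereas the kernel estimate depends on every observation and on the bandwidth.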
The distances between the groups of individuals are given by the L^2-distances between the probability densities corresponding to these groups. The hclust function is then applied to the distance matrix to perform the hierarchical clustering of the groups.
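The clustering step itself can be sketched with the stats package alone. The distance matrix below is hypothetical (four groups named g1 to g4 with made-up values); it only shows how a symmetric matrix of distances between groups is passed to hclust and then cut into a partition.

```r
# Hypothetical symmetric matrix of distances between four groups
labels <- c("g1", "g2", "g3", "g4")
d <- matrix(c(0.00, 0.10, 0.80, 0.90,
              0.10, 0.00, 0.70, 0.85,
              0.80, 0.70, 0.00, 0.20,
              0.90, 0.85, 0.20, 0.00),
            nrow = 4, byrow = TRUE,
            dimnames = list(labels, labels))

# hclust expects an object of class "dist", hence the conversion
hc <- hclust(as.dist(d), method = "complete")

# Cut the tree into two clusters
groups <- cutree(hc, k = 2)
print(groups)
```

With complete linkage, g1 and g2 are merged first, then g3 and g4, so cutting at two clusters separates these two pairs.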
If windowh is a numerical value h, the bandwidth matrix is of the form hS, where S is either the square root of the covariance matrix (dimension > 1) or the standard deviation of the estimated density.
If windowh = NULL (default), h in the above formula is computed using the bandwidth.parameter function.
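As an illustration of a bandwidth matrix built as a scalar times the square root of the covariance matrix, the following base R sketch computes the matrix square root by eigendecomposition. The simulated data and the scalar value 0.5 are placeholders chosen for the example; in the package the scalar comes from windowh or from bandwidth.parameter, whose formula is not reproduced here.

```r
set.seed(42)
x <- cbind(rnorm(100), rnorm(100, sd = 2))   # simulated bivariate sample for one group

V <- cov(x)                                  # sample covariance matrix

# Symmetric square root of V: S %*% S equals V (up to rounding)
e <- eigen(V, symmetric = TRUE)
S <- e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)

h <- 0.5          # placeholder scalar (hypothetical value)
H <- h * S        # bandwidth matrix of the form "scalar times S"

print(H)
print(eigen(H, symmetric = TRUE)$values)     # all positive: H is positive definite
```

Because the covariance matrix of non-degenerate data is positive definite, its square root and any positive multiple of it are positive definite as well, which is the property required of multivariate bandwidths.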
The distance or dissimilarity between the estimated densities is either the L^2 distance, the Hellinger distance, Jeffreys measure (symmetrised Kullback-Leibler divergence) or the Wasserstein distance.
If it is the L^2 distance (distance = "l2" or distance = "l2norm"), the densities can be either parametrically estimated or estimated using the Gaussian kernel. If it is the Hellinger distance (distance = "hellinger"), Jeffreys measure (distance = "jeffreys") or the Wasserstein distance (distance = "wasserstein"), the densities are considered Gaussian and necessarily parametrically estimated.
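For univariate Gaussian densities, these four measures have well-known closed forms, sketched below in base R. The function names are hypothetical and this is not the package's code. Normalisation conventions vary (here the squared Hellinger distance is one minus the Bhattacharyya coefficient, and the Wasserstein distance is of order 2), so values may differ from the package's by a constant factor.

```r
# Arguments: means m1, m2 and standard deviations s1, s2 of the two Gaussian densities

l2_gauss <- function(m1, s1, m2, s2) {
  ff <- 1 / (2 * s1 * sqrt(pi))                        # integral of f^2
  gg <- 1 / (2 * s2 * sqrt(pi))                        # integral of g^2
  fg <- dnorm(m1 - m2, mean = 0, sd = sqrt(s1^2 + s2^2))  # integral of f * g
  sqrt(max(ff + gg - 2 * fg, 0))
}

hellinger_gauss <- function(m1, s1, m2, s2) {
  # Bhattacharyya coefficient: integral of sqrt(f * g)
  bc <- sqrt(2 * s1 * s2 / (s1^2 + s2^2)) *
    exp(-(m1 - m2)^2 / (4 * (s1^2 + s2^2)))
  sqrt(max(1 - bc, 0))
}

jeffreys_gauss <- function(m1, s1, m2, s2) {
  # Sum of the two Kullback-Leibler divergences (the log terms cancel)
  d2 <- (m1 - m2)^2
  (s1^2 + d2) / (2 * s2^2) + (s2^2 + d2) / (2 * s1^2) - 1
}

wasserstein_gauss <- function(m1, s1, m2, s2) {
  sqrt((m1 - m2)^2 + (s1 - s2)^2)
}

# Hypothetical parameters for three groups, then Hellinger distance matrix and clustering
params <- data.frame(mean = c(0, 0.2, 3), sd = c(1, 1.1, 0.5),
                     row.names = c("a", "b", "c"))
n <- nrow(params)
D <- matrix(0, n, n, dimnames = list(rownames(params), rownames(params)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    D[i, j] <- hellinger_gauss(params$mean[i], params$sd[i],
                               params$mean[j], params$sd[j])
  }
}
hc <- hclust(as.dist(D), method = "complete")
print(round(D, 3))
```

Groups a and b have close parameters, so they are merged first; group c joins at a larger height.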
Value
Returns an object of class fhclustd, that is a list including:
distances |
matrix of the distances between the densities. |
clust |
an object of class hclust. |
Author(s)
Rachid Boumaza, Pierre Santagostini, Smail Yousfi, Gilles Hunault, Sabine Demotes-Mainard
See Also
fdiscd.predict, fdiscd.misclass
Examples
data(castles.dated)
stones <- castles.dated$stones
periods <- castles.dated$periods
periods123 <- periods[periods$period %in% 1:3, "castle"]
stones123 <- stones[stones$castle %in% periods123, ]
stones123$castle <- as.factor(as.character(stones123$castle))
yf <- as.folder(stones123)
# Jeffreys measure (default):
resultjef <- fhclustd(yf)
print(resultjef)
print(resultjef, dist.print = TRUE)
plot(resultjef)
plot(resultjef, hang = -1)
# Use cutree (stats package) to get the partition
cutree(resultjef$clust, k = 1:4)
cutree(resultjef$clust, k = 5)
cutree(resultjef$clust, h = 0.041)
# Applied to a data frame (Jeffreys measure):
resultdf <- fhclustd(stones123, group.name = "castle")
print(resultdf)
# Use cutree (stats package) to get the partition
cutree(resultdf$clust, k = 1:4)
cutree(resultdf$clust, k = 5)
cutree(resultdf$clust, h = 0.041)
# Hellinger distance:
resulthel <- fhclustd(yf, distance = "hellinger")
print(resulthel)
print(resulthel, dist.print = TRUE)
plot(resulthel)
plot(resulthel, hang = -1)
# Use cutree (stats package) to get the partition
cutree(resulthel$clust, k = 1:4)
cutree(resulthel$clust, k = 5)
cutree(resulthel$clust, h = 0.041)
## Not run:
# L2-distance:
xf <- as.folder(stones)
result <- fhclustd(xf, distance = "l2")
print(result)
print(result, dist.print = TRUE)
plot(result)
plot(result, hang = -1)
# Use cutree (stats package) to get the partition
cutree(result$clust, k = 1:5)
cutree(result$clust, k = 5)
cutree(result$clust, h = 0.18)
## End(Not run)
periods123 <- periods[periods$period %in% 1:3, "castle"]
stones123 <- stones[stones$castle %in% periods123, ]
stones123$castle <- as.factor(as.character(stones123$castle))
yf <- as.folder(stones123)
result123 <- fhclustd(yf, distance = "l2")
print(result123)
print(result123, dist.print = TRUE)
plot(result123)
plot(result123, hang = -1)
# Use cutree (stats package) to get the partition
cutree(result123$clust, k = 1:4)
cutree(result123$clust, k = 5)
cutree(result123$clust, h = 0.041)