fclustering {GET} | R Documentation |
Functional clustering
Description
Functional clustering based on a specified measure.
The available measures are described in central_region.
Usage
fclustering(
curve_sets,
k,
type = c("area", "st", "erl", "cont"),
triangineq = FALSE,
...
)
Arguments
curve_sets |
A |
k |
The number of clusters. |
type |
The measure which is used to compute the dissimilarity matrix. The preferred options
are |
triangineq |
Logical. Whether or not to compute the proportion of combinations of functions that satisfy the triangular inequality, see 'Value'. |
... |
Additional parameters to be passed to |
Details
Functional clustering joins the list of curve_set objects into one curve_set with long functions and
applies the specified measure to the differences of all functions. This provides a dissimilarity matrix,
which is used in the partitioning around medoids (pam) procedure. The resulting clusters can then be shown by plotting
the functions separately for each curve_set. Thus, when the result object is plotted, the panel with all the medoids
is shown for each curve_set, followed by all clusters, each represented by its central region, its medoid and all curves belonging to it.
If some group contains fewer than three curves, the central region is not plotted for that group. This leads to a warning message from ggplot2.
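The step from a dissimilarity matrix to clusters can be illustrated with a minimal sketch. It is not the internal code of fclustering: the toy curves are simulated, and the Manhattan distance between curves is only a stand-in assumption for the measures of central_region. The call to pam from the cluster package shows how a precomputed dissimilarity is partitioned around medoids.

```r
library(cluster)  # provides pam()

set.seed(1)
r <- seq(0, 1, length.out = 50)

# Toy curves (hypothetical data): two groups of 5 curves each,
# stored column-wise, i.e. one column per curve
curves <- cbind(
  sapply(1:5, function(i) sin(2 * pi * r) + rnorm(50, sd = 0.1)),
  sapply(1:5, function(i) 3 + cos(2 * pi * r) + rnorm(50, sd = 0.1))
)

# Stand-in dissimilarity (assumption): Manhattan distance between curves.
# fclustering uses the measure chosen by 'type' instead.
d <- dist(t(curves), method = "manhattan")

# Partitioning around medoids on the precomputed dissimilarity
fit <- pam(d, k = 2, diss = TRUE)
fit$clustering  # cluster label of each curve
fit$id.med      # indices of the medoid curves
```

In this sketch the two groups are well separated, so each group should end up in its own cluster. With real data the quality of the partition depends on the chosen measure and on k.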
Value
An object of class fclust, containing
curve_sets = The set(s) of functions determined for clustering
k = Number of clusters
type = The measure used to compute the dissimilarity matrix
triangineq = The proportion of combinations of functions that satisfy the triangular inequality. The triangular inequality must hold for the chosen measure to form a metric. In some rare cases it does not hold for the ‘area’ measure, so this check is provided to verify that the ‘area’ measure forms a metric on the data. The value of triangineq must be 1 for the inequality to hold for all functions.
dis = The joined dissimilarity matrix
pam = Results of the partitioning around medoids (pam) method applied on the joined functions with the dissimilarity matrix (dis). See pam.
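The meaning of the triangineq proportion can be made concrete with a small sketch. The helper triangle_proportion below is hypothetical and is not part of GET. It is one plausible way to count, over all ordered triples of distinct items, how often d(i, j) <= d(i, k) + d(k, j) holds. The exact combinations counted by fclustering may differ.

```r
# Hypothetical helper (not part of GET): proportion of ordered triples of
# distinct items (i, j, k) for which d[i, j] <= d[i, k] + d[k, j] holds.
# 'd' is a symmetric dissimilarity matrix or a 'dist' object.
triangle_proportion <- function(d, tol = 1e-12) {
  d <- as.matrix(d)
  n <- nrow(d)
  ok <- 0
  total <- 0
  for (i in seq_len(n)) for (j in seq_len(n)) for (k in seq_len(n)) {
    if (i != j && i != k && j != k) {
      total <- total + 1
      # 'tol' guards against spurious failures from floating-point rounding
      if (d[i, j] <= d[i, k] + d[k, j] + tol) ok <- ok + 1
    }
  }
  ok / total
}

# Euclidean distances form a metric, so every triple satisfies the inequality
pts <- matrix(c(0, 0, 1, 0, 0, 1, 2, 2), ncol = 2, byrow = TRUE)
triangle_proportion(dist(pts))

# A dissimilarity that violates it: d(1,3) = 5 > d(1,2) + d(2,3) = 2
bad <- matrix(c(0, 1, 5,
                1, 0, 1,
                5, 1, 0), nrow = 3, byrow = TRUE)
triangle_proportion(bad)
```

A value below 1, as for the second matrix, indicates that the dissimilarity does not behave as a metric on these data.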
References
Dai, W., Athanasiadis, S., Mrkvička, T. (2021) A new functional clustering method with combined dissimilarity sources and graphical interpretation. IntechOpen, London, UK. DOI: 10.5772/intechopen.100124
See Also
Examples
# Read raw data from population growth rdata
# with countries over million inhabitants
data("popgrowthmillion")
# Create centred data
m <- apply(popgrowthmillion, 2, mean) # Country-wise means
cpopgrowthmillion <- popgrowthmillion
for(i in seq_len(nrow(popgrowthmillion))) {
  cpopgrowthmillion[i,] <- popgrowthmillion[i,] - m
}
# Create scaled data
t2 <- function(v) { sqrt(sum(v^2)) }
s <- apply(cpopgrowthmillion, 2, t2)
spopgrowthmillion <- popgrowthmillion
for(i in seq_len(nrow(popgrowthmillion))) {
  spopgrowthmillion[i,] <- cpopgrowthmillion[i,]/s
}
# Create curve sets
r <- 1951:2015
cset1 <- curve_set(r = r, obs = popgrowthmillion)
cset2 <- curve_set(r = r, obs = spopgrowthmillion)
csets <- list(Raw = cset1, Shape = cset2)
# Functional clustering with respect to the joined "area" difference measure
# and "joined" central regions of each group
res <- fclustering(csets, k=3, type="area")
p <- plot(res, plotstyle = "marginal", coverage = 0.5)
p[[1]] # Central functions
p[[2]] # Groups: central functions and regions
# To collect the two figures into one use, e.g., patchwork:
if(require("patchwork", quietly=TRUE)) {
  p[[1]] + p[[2]] + plot_layout(widths = c(1, res$k))
}
# Silhouette plot of pam
plot(res$pam)