hier_clust {tidyclust}    R Documentation

Hierarchical (Agglomerative) Clustering

Description

hier_clust() defines a model that fits clusters based on a distance-based dendrogram.

There are different ways to fit this model, and the method of estimation is chosen by setting the model engine. The engine-specific pages for this model are listed below.

Usage

hier_clust(
  mode = "partition",
  engine = "stats",
  num_clusters = NULL,
  cut_height = NULL,
  linkage_method = "complete"
)

Arguments

mode

A single character string for the type of model. The only possible value for this model is "partition".

engine

A single character string specifying what computational engine to use for fitting. Possible engines are listed below. The default for this model is "stats".

num_clusters

Positive integer, number of clusters in model (optional).

cut_height

Positive double, height at which to cut the dendrogram to obtain cluster assignments (only used if num_clusters is NULL).

linkage_method

The agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).
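The difference between num_clusters and cut_height can be illustrated with base R alone. The sketch below assumes that the "stats" engine behaves like stats::hclust() followed by stats::cutree(); this page does not confirm that, and the data set, the chosen columns and the height of 2 are arbitrary choices for illustration.

```r
# Base R sketch; ASSUMES the "stats" engine acts like hclust() + cutree().
x <- scale(mtcars[, c("mpg", "hp", "wt")])

# Build the dendrogram with the default linkage, "complete".
hc <- stats::hclust(stats::dist(x), method = "complete")

# Analogue of num_clusters = 3: cut the tree into exactly three groups.
by_k <- stats::cutree(hc, k = 3)

# Analogue of cut_height = 2: cut the tree at height 2, so the number
# of groups follows from the data instead of being fixed in advance.
by_h <- stats::cutree(hc, h = 2)

table(by_k)
length(unique(by_h))
```

Fixing the number of clusters gives a predictable result size, while cutting at a height lets the structure of the dendrogram decide how many groups emerge.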

Details

What does it mean to predict?

To predict the cluster assignment for a new observation, we find the closest cluster. How “closeness” is measured depends on the type of linkage specified in the model.
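A minimal sketch of fitting and predicting, assuming tidyclust is installed. The mtcars data, the three chosen columns and the 25/7 row split are arbitrary illustrations, not part of this page.

```r
library(tidyclust)

# Three clusters with average linkage on the default "stats" engine.
spec <- hier_clust(num_clusters = 3, linkage_method = "average")

# Fit on the first 25 rows; the formula has no outcome on the left side.
hc_fit <- fit(spec, ~ mpg + hp + wt, data = mtcars[1:25, ])

# Assign each held-out row to its closest cluster.
preds <- predict(hc_fit, new_data = mtcars[26:32, ])
preds
```

Changing linkage_method in the specification changes how the closest cluster is chosen for the new rows, so predictions can differ between linkages even on the same training data.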

Value

A hier_clust cluster specification.

Examples

# Show all engines
modelenv::get_from_env("hier_clust")

hier_clust()

[Package tidyclust version 0.2.1 Index]