dispersion_objective {anticlust} R Documentation

## Cluster dispersion

### Description

Compute the dispersion objective for a given clustering (i.e., the minimum distance between two elements within the same cluster).

### Usage

dispersion_objective(x, clusters)


### Arguments

 x  The data input. Can be one of two structures: (1) A feature matrix where rows correspond to elements and columns correspond to variables (a single numeric variable can be passed as a vector). (2) An N x N dissimilarity matrix; can be an object of class dist (e.g., returned by dist or as.dist) or a matrix where the entries of the upper and lower triangular matrix represent pairwise dissimilarities.

 clusters  A vector representing (anti)clusters (e.g., returned by anticlustering).

### Details

The dispersion is the minimum distance between any two elements that are assigned to the same cluster. When the input x is a feature matrix, the Euclidean distance is used as the distance measure. Maximizing the dispersion maximizes the minimum heterogeneity within clusters and is therefore an anticlustering task.
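The definition above can be illustrated with a minimal base R sketch. The helper manual_dispersion is hypothetical and not part of anticlust; it assumes a feature input (vector or matrix) and Euclidean distances, and it is meant only to show what the objective measures, not how the package computes it.

```r
# Hypothetical helper (not part of anticlust): minimum within-cluster distance
manual_dispersion <- function(x, clusters) {
  d <- as.matrix(dist(x))                          # pairwise Euclidean distances
  same_cluster <- outer(clusters, clusters, "==")  # TRUE if both elements share a cluster
  diag(same_cluster) <- FALSE                      # ignore the distance of an element to itself
  min(d[same_cluster])
}

# Four elements described by a single numeric variable, split into two clusters
x <- c(0, 1, 5, 7)
clusters <- c(1, 1, 2, 2)

# Cluster 1 contains 0 and 1 (distance 1); cluster 2 contains 5 and 7 (distance 2)
manual_dispersion(x, clusters)  # 1
```

Reassigning the elements as c(1, 2, 1, 2) pairs 0 with 5 and 1 with 7, which raises the minimum within-cluster distance to 5. This is the sense in which maximizing the dispersion spreads similar elements across different clusters.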

### References

Brusco, M. J., Cradit, J. D., & Steinley, D. (2020). Combining diversity and dispersion criteria for anticlustering: A bicriterion approach. British Journal of Mathematical and Statistical Psychology, 73, 375-396. https://doi.org/10.1111/bmsp.12186

### Examples


N <- 50 # number of elements
M <- 2  # number of variables per element
K <- 5  # number of clusters
random_data <- matrix(rnorm(N * M), ncol = M)
random_clusters <- sample(rep_len(1:K, N))
dispersion_objective(random_data, random_clusters)

# Maximize the dispersion, starting from the random assignment
optimized_clusters <- anticlustering(
  random_data,
  K = random_clusters,               # initial cluster assignment
  objective = dispersion_objective   # objective passed as a function
)
dispersion_objective(random_data, optimized_clusters)



[Package anticlust version 0.8.1 Index]