dispersion_objective {anticlust} R Documentation

## Cluster dispersion

### Description

Compute the dispersion objective for a given clustering (i.e., the minimum distance between two elements within the same cluster).

### Usage

```
dispersion_objective(x, clusters)
```

### Arguments

 `x` The data input. Can be one of two structures: (1) A feature matrix where rows correspond to elements and columns correspond to variables (a single numeric variable can be passed as a vector). (2) An N x N dissimilarity matrix; can be an object of class `dist` (e.g., returned by `dist` or `as.dist`) or a `matrix` where the entries of the upper and lower triangular matrix represent pairwise dissimilarities.

 `clusters` A vector representing (anti)clusters (e.g., returned by `anticlustering`).

### Details

The dispersion is the minimum distance between any two elements that belong to the same cluster. When the input `x` is a feature matrix, the Euclidean distance is used as the distance measure. Maximizing the dispersion maximizes the minimum heterogeneity within clusters and is therefore an anticlustering task.
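To make the definition concrete, the following is a minimal base R sketch of the same quantity. The helper `manual_dispersion` is hypothetical and not part of anticlust; it is meant only to illustrate the definition under the assumption that `x` is a feature matrix or numeric vector, and it is not the package's implementation.

```r
# Hypothetical helper (not part of anticlust): smallest pairwise Euclidean
# distance among elements that share a cluster label.
manual_dispersion <- function(x, clusters) {
  d <- as.matrix(dist(x))                          # N x N Euclidean distances
  same_cluster <- outer(clusters, clusters, "==")  # TRUE where labels match
  diag(same_cluster) <- FALSE                      # drop self-distances of 0
  min(d[same_cluster])
}

# Four points on a line, split into two clusters
x <- c(0, 1, 5, 9)
clusters <- c(1, 2, 1, 2)
manual_dispersion(x, clusters)
# Cluster 1 holds 0 and 5 (distance 5), cluster 2 holds 1 and 9 (distance 8),
# so the result is 5.
```

If no cluster contains at least two elements, there are no within-cluster pairs and this sketch returns `Inf` with a warning.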

### References

Brusco, M. J., Cradit, J. D., & Steinley, D. (2020). Combining diversity and dispersion criteria for anticlustering: A bicriterion approach. British Journal of Mathematical and Statistical Psychology, 73, 375-396. https://doi.org/10.1111/bmsp.12186

### Examples

```
library(anticlust)

N <- 50 # number of elements
M <- 2  # number of variables per element
K <- 5  # number of clusters
random_data <- matrix(rnorm(N * M), ncol = M)
# Random assignment of the N elements to K equal-sized clusters
random_clusters <- sample(rep_len(1:K, N))
dispersion_objective(random_data, random_clusters)

# Maximize the dispersion, using the random assignment as the starting point
optimized_clusters <- anticlustering(
  random_data,
  K = random_clusters,
  objective = dispersion_objective
)
dispersion_objective(random_data, optimized_clusters)
```

