partition.BNPdens {BNPmix}    R Documentation

Estimate the partition of the data

Description

The partition method computes point estimates of the partition of the data from the output of a Bayesian nonparametric mixture model stored in a BNPdens class object, according to a specified loss function.

Usage

## S3 method for class 'BNPdens'
partition(object, dist = "VI", max_k = NULL, ...)

Arguments

object

an object of class BNPdens;

dist

a loss function defined on the space of partitions; it can be variation of information ("VI") or "Binder", default "VI". See details;

max_k

maximum number of clusters passed to the cutree function. See value below;

...

additional arguments to be passed.

Details

This method returns point estimates for the clustering of the data induced by a nonparametric mixture model. The estimates are obtained by using one of two loss functions on the space of partitions: variation of information (dist = 'VI') or Binder's loss (dist = 'Binder'). The function is based on the mcclust.ext code by Sara Wade (Wade and Ghahramani, 2018).
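As a rough illustration of how a loss on the space of partitions turns MCMC draws into a point estimate, the sketch below uses base R only. It is not the code used by BNPmix or mcclust.ext: the matrix draws, the helper co_clustering and the function binder_loss are hypothetical names introduced here, and the Binder loss is taken with equal misclassification costs, up to a constant factor.

```r
# Hypothetical MCMC output: each row is one sampled allocation of 6 observations.
draws <- rbind(
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 2, 2, 2, 2),
  c(1, 1, 1, 2, 2, 3),
  c(1, 1, 1, 2, 2, 2)
)

# 0/1 matrix indicating which pairs of observations share a cluster label.
co_clustering <- function(labels) outer(labels, labels, "==") * 1

# Posterior similarity matrix: share of draws in which i and j are co-clustered.
psm <- Reduce(`+`, lapply(seq_len(nrow(draws)),
                          function(s) co_clustering(draws[s, ]))) / nrow(draws)

# Expected Binder loss of a candidate partition (assumption: equal costs,
# constant factor dropped), summed over pairs i < j.
binder_loss <- function(labels, psm) {
  upper <- upper.tri(psm)
  sum(abs(co_clustering(labels)[upper] - psm[upper]))
}

# Evaluate every visited partition and keep the one with the smallest loss.
losses <- apply(draws, 1, binder_loss, psm = psm)
best <- draws[which.min(losses), ]
```

The variation of information loss follows the same pattern with a different function in place of binder_loss; see Wade and Ghahramani (2018) for its definition.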

Value

The method returns a list containing a matrix with nrow(data) columns and 3 rows. Each row reports the cluster labels of the observations according to one of three approaches. The first and second rows are the output of an agglomerative clustering procedure, obtained by applying the function hclust to the dissimilarity matrix with complete and average linkage, respectively. The number of clusters is between 1 and max_k and is chosen according to a lower bound on the expected loss, as described in Wade and Ghahramani (2018). The third row reports the partition visited by the MCMC with the minimum distance dist from the dissimilarity matrix.

In addition, the list reports a vector with three scores representing the lower bound on the expected loss for the three partitions.
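The agglomerative step described above can be sketched with hclust and cutree from the stats package. This is a simplified illustration under stated assumptions, not the package's implementation: the posterior similarity matrix psm holds made-up values, best_cut and binder_loss are hypothetical helpers, and a Binder-type loss stands in for the lower bound on the expected loss used by Wade and Ghahramani (2018).

```r
# Hypothetical posterior similarity matrix for 6 observations (made-up values).
psm <- matrix(0.1, 6, 6)
psm[1:3, 1:3] <- 0.9
psm[4:6, 4:6] <- 0.8
diag(psm) <- 1

max_k <- 4  # assumed upper bound on the number of clusters

# Binder-type loss, used here only as a simple stand-in criterion.
binder_loss <- function(labels, psm) {
  upper <- upper.tri(psm)
  sum(abs(outer(labels, labels, "==")[upper] - psm[upper]))
}

# Build one dendrogram on the dissimilarity 1 - psm, cut it at
# k = 1, ..., max_k, and keep the cut with the smallest loss.
best_cut <- function(linkage) {
  tree <- hclust(as.dist(1 - psm), method = linkage)
  cuts <- lapply(seq_len(max_k), function(k) cutree(tree, k = k))
  losses <- vapply(cuts, binder_loss, numeric(1), psm = psm)
  list(labels = cuts[[which.min(losses)]], score = min(losses))
}

candidates <- list(complete = best_cut("complete"),
                   average = best_cut("average"))

# One row of labels per linkage, plus the score attained by each row.
partitions <- do.call(rbind, lapply(candidates, `[[`, "labels"))
scores <- vapply(candidates, `[[`, numeric(1), "score")
```

In this sketch partitions has one row per linkage and one column per observation, mirroring the first two rows of the matrix described above, and scores plays the role of the accompanying vector of scores.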

References

Wade, S., Ghahramani, Z. (2018). Bayesian cluster analysis: Point estimation and credible balls. Bayesian Analysis, 13, 559-626.

Examples

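library(BNPmix)

# Toy data: two well-separated Gaussian groups of 10 observations each
data_toy <- c(rnorm(10, -3, 1), rnorm(10, 3, 1))
grid <- seq(-7, 7, length.out = 50)
# Fit a Pitman-Yor mixture model with a short MCMC run
fit <- PYdensity(y = data_toy, mcmc = list(niter = 100,
                      nburn = 10, nupd = 100), output = list(grid = grid))
class(fit)
# Point estimates of the partition under the default VI loss
partition(fit)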


[Package BNPmix version 0.2.8 Index]