partition.BNPdens {BNPmix} | R Documentation |
Estimate the partition of the data
Description
The partition method estimates the partition of the data based on the output generated by a Bayesian nonparametric mixture
model, according to a specified criterion, for a BNPdens class object.
Usage
## S3 method for class 'BNPdens'
partition(object, dist = "VI", max_k = NULL, ...)
Arguments
object |
an object of class BNPdens. |
dist |
a loss function defined on the space of partitions;
it can be variation of information ("VI") or Binder's loss ("Binder"). See Details. |
max_k |
maximum number of clusters passed to the |
... |
additional arguments to be passed. |
Details
This method returns point estimates for the clustering of the data induced by a nonparametric mixture model.
The estimates are obtained under one of two loss functions on the space of partitions: variation of information
(dist = 'VI') or Binder's loss (dist = 'Binder').
The function is based on the mcclust.ext code by Sara Wade (Wade and Ghahramani, 2018).
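To make the two loss functions concrete, the following base R sketch computes them for a pair of partitions given as label vectors. It is an illustration only and is not the code used by BNPmix or mcclust.ext. The helper names binder_loss and vi_dist are hypothetical, Binder's loss is taken here in its unweighted pair-counting form, and the use of log base 2 in the variation of information is an assumption.

```r
# Binder's loss (unweighted form, assumed here): number of pairs of observations
# on which the two partitions disagree about co-clustering.
binder_loss <- function(c1, c2) {
  stopifnot(length(c1) == length(c2))
  same1 <- outer(c1, c1, "==")
  same2 <- outer(c2, c2, "==")
  sum(same1[upper.tri(same1)] != same2[upper.tri(same2)])
}

# Variation of information: H(c1) + H(c2) - 2 I(c1, c2),
# computed through the equivalent form 2 H(c1, c2) - H(c1) - H(c2).
# Log base 2 is an assumption of this sketch.
vi_dist <- function(c1, c2) {
  stopifnot(length(c1) == length(c2))
  entropy <- function(p) -sum(p[p > 0] * log2(p[p > 0]))
  joint <- table(c1, c2) / length(c1)  # joint distribution of the labels
  2 * entropy(joint) - entropy(rowSums(joint)) - entropy(colSums(joint))
}

# Two partitions of six observations that differ in the third observation
a <- c(1, 1, 1, 2, 2, 2)
b <- c(1, 1, 2, 2, 2, 2)

binder_loss(a, b)
vi_dist(a, b)

# Both losses depend only on the grouping, not on the label values
vi_dist(a, c(2, 2, 2, 1, 1, 1))
```

Both functions are zero when the two partitions coincide and are unaffected by relabelling the clusters, which is the property that makes them usable as losses on the space of partitions.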
Value
The method returns a list containing a matrix with nrow(data) columns and 3 rows.
Each row reports the cluster labels of the observations under one of three approaches.
The first and second rows are the output of an agglomerative clustering procedure obtained by applying the function hclust
to the dissimilarity matrix, using complete and average linkage, respectively.
The number of clusters is between 1 and max_k and is chosen according to a lower bound
on the expected loss, as described in Wade and Ghahramani (2018).
The third row reports the partition visited by the MCMC with the minimum distance dist
from the dissimilarity matrix.
In addition, the list reports a vector with three scores representing the lower bound on the expected loss for the three partitions.
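The three approaches can be sketched in base R on a small set of sampled partitions. This is a simplified illustration under stated assumptions, not the BNPmix implementation: the matrix draws is made-up MCMC output, the dissimilarity is assumed to be one minus the posterior co-clustering frequency, and candidates are scored with the expected Binder's loss rather than with the lower bound used in Wade and Ghahramani (2018). The names psm, expected_binder and best_col are hypothetical.

```r
# Hypothetical MCMC output: each row is one sampled partition of 6 observations
draws <- rbind(
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 2, 2, 2, 2),
  c(1, 1, 1, 2, 2, 3)
)
n <- ncol(draws)

# Posterior similarity matrix: share of draws in which i and j are co-clustered
psm <- matrix(0, n, n)
for (s in seq_len(nrow(draws))) {
  psm <- psm + outer(draws[s, ], draws[s, ], "==")
}
psm <- psm / nrow(draws)

# Agglomerative clustering on the assumed dissimilarity 1 - psm,
# cutting each tree at every number of clusters from 1 to max_k
max_k <- 3
d <- as.dist(1 - psm)
cand_complete <- cutree(hclust(d, method = "complete"), k = 1:max_k)
cand_average  <- cutree(hclust(d, method = "average"),  k = 1:max_k)

# Expected Binder's loss of a candidate partition, estimated from psm
expected_binder <- function(cl, psm) {
  same <- outer(cl, cl, "==")
  sum(abs(same - psm)[upper.tri(psm)])
}

# Column index of the candidate with the smallest score
best_col <- function(cands) {
  which.min(apply(cands, 2, expected_binder, psm = psm))
}

est_complete <- cand_complete[, best_col(cand_complete)]
est_average  <- cand_average[, best_col(cand_average)]
# Third approach: the sampled partition with the smallest score
est_draw <- draws[which.min(apply(draws, 1, expected_binder, psm = psm)), ]

# One row per approach, one column per observation
rbind(est_complete, est_average, est_draw)
```

The final matrix mirrors the layout described above: three rows, one per approach, and one column per observation.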
References
Wade, S., Ghahramani, Z. (2018). Bayesian cluster analysis: Point estimation and credible balls. Bayesian Analysis, 13, 559-626.
Examples
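library(BNPmix)
# Toy data: two well-separated Gaussian groups of 10 observations each
data_toy <- c(rnorm(10, -3, 1), rnorm(10, 3, 1))
grid <- seq(-7, 7, length.out = 50)
# Fit the mixture model with a short MCMC run
fit <- PYdensity(y = data_toy, mcmc = list(niter = 100,
                 nburn = 10, nupd = 100), output = list(grid = grid))
class(fit)
# Point estimate of the partition under the default loss (dist = "VI")
partition(fit)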