dppmix_mvnorm {dppmix} | R Documentation |
Fit a determinantal point process multivariate normal mixture model.
Description
Discover clusters in multidimensional data using a multivariate normal mixture model with a determinantal point process prior.
Usage
dppmix_mvnorm(
X,
hparams = NULL,
store = NULL,
control = NULL,
fixed = NULL,
verbose = TRUE
)
Arguments
X | an N x J numeric data matrix, one observation per row |
hparams | a list of hyperparameter values: delta, a0, b0, theta, and sigma_pro_mu (see Details) |
store | a vector of character strings specifying additional variables of interest (default NULL) |
control | a list of control parameters |
fixed | a list of fixed parameter values |
verbose | whether to emit verbose messages |
Details
A determinantal point process (DPP) prior is a repulsive prior. Compared to mixture models using independent priors, a DPP mixture model will often discover a parsimonious set of mixture components (clusters).
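To give a feel for what "repulsive" means here, the following base-R sketch compares the determinant of a kernel matrix for well-separated versus nearly coincident cluster means. The squared-exponential kernel, the helper name `sq_exp_kernel`, and the value of `theta` are assumptions chosen for illustration; they are not taken from the package internals.

```r
# Hypothetical helper: squared-exponential kernel matrix between the rows of M.
# The kernel form and the theta value are illustrative assumptions.
sq_exp_kernel <- function(M, theta = 1) {
  D2 <- as.matrix(dist(M))^2      # squared Euclidean distances between rows
  exp(-D2 / theta^2)
}

means_far   <- rbind(c(-6, -3), c(0, 4))      # well-separated means
means_close <- rbind(c(0, 4), c(0.1, 4.1))    # nearly identical means

# Under a DPP, the prior mass of a configuration is proportional to the
# determinant of its kernel matrix, so nearby points are penalised.
det_far   <- det(sq_exp_kernel(means_far))
det_close <- det(sq_exp_kernel(means_close))
c(far = det_far, close = det_close)
```

The determinant for the nearly coincident pair is much smaller than for the separated pair, which is the mechanism that discourages redundant components.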
Model fitting is done by sampling parameters from the posterior distribution using a reversible jump Markov chain Monte Carlo sampling approach.
Given X = [x_i], where each x_i is a J-dimensional real vector, we seek the posterior distribution of the latent variable z = [z_i], where each z_i is an integer representing cluster membership.
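A sketch of the hierarchical model implied by the parameters and hyperparameters named on this page. The specific distributional forms, in particular the symmetric Dirichlet on w with concentration delta, are assumptions inferred from those names and should be checked against the cited paper.

```latex
\begin{aligned}
x_i \mid z_i = k &\sim \mathrm{Normal}(\mu_k, \Sigma_k), \\
z_i &\sim \mathrm{Categorical}(w), \\
w &\sim \mathrm{Dirichlet}(\delta, \ldots, \delta), \\
(\mu_1, \ldots, \mu_K) &\sim \mathrm{DPP}(C).
\end{aligned}
```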
The DPP prior is defined through C, the covariance function that evaluates the distances among the data points.
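One common choice for such a covariance function is a squared-exponential kernel scaled by theta. The form below is an assumption consistent with theta being listed as a hyperparameter, not a transcription of the package's definition; u and v denote any two J-dimensional points.

```latex
C(u, v) = \exp\!\left( - \sum_{j=1}^{J} \frac{(u_j - v_j)^2}{\theta^2} \right)
```

Under this form, C approaches 1 as two points coincide and decays toward 0 as they move apart, with theta setting the length scale of the repulsion.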
We also define Sigma_k = E_k Lambda_k E_k^T, where E_k is an orthonormal matrix whose columns are eigenvectors. We further assume that E_k = E is fixed across all cluster components so that E can be estimated as the eigenvectors of the covariance matrix of the data matrix X. Finally, we put a prior on the entries of the diagonal matrix Lambda_k.
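A sketch of the covariance decomposition and the prior on its eigenvalues. The Gamma prior on the inverse eigenvalues is an assumption based on the hyperparameter names a0 and b0, and the shape/rate parameterization is likewise assumed.

```latex
\Sigma_k = E \, \Lambda_k \, E^{\top}, \qquad
\Lambda_k = \mathrm{diag}(\lambda_{k1}, \ldots, \lambda_{kJ}), \qquad
\lambda_{kj}^{-1} \sim \mathrm{Gamma}(a_0, b_0).
```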
Hence, the hyperparameters of the model include delta, a0, b0, and theta, as well as the sampling hyperparameter sigma_pro_mu, which controls the spread of the Gaussian proposal distribution for the random-walk Metropolis-Hastings update of the mu parameter.
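The role of sigma_pro_mu can be illustrated with a generic random-walk Metropolis-Hastings update in base R. This is a minimal sketch, not the package's sampler: the target density `log_target` is a hypothetical stand-in (a standard normal), and the helpers `rw_mh_step` and `accept_rate` are invented for this example.

```r
set.seed(1)

# Hypothetical log target for a scalar mean; a standard normal stands in
# for the real conditional posterior.
log_target <- function(mu) dnorm(mu, mean = 0, sd = 1, log = TRUE)

# One random-walk Metropolis-Hastings update with a Gaussian proposal
# whose standard deviation is sigma_pro_mu.
rw_mh_step <- function(mu, sigma_pro_mu) {
  proposal <- rnorm(1, mean = mu, sd = sigma_pro_mu)
  log_ratio <- log_target(proposal) - log_target(mu)
  if (log(runif(1)) < log_ratio) proposal else mu
}

# Fraction of accepted proposals over n updates.
accept_rate <- function(sigma_pro_mu, n = 2000) {
  mu <- 0
  accepted <- 0
  for (i in seq_len(n)) {
    new_mu <- rw_mh_step(mu, sigma_pro_mu)
    accepted <- accepted + (new_mu != mu)
    mu <- new_mu
  }
  accepted / n
}

rates <- c(small = accept_rate(0.1), large = accept_rate(10))
rates
```

A small spread accepts nearly every proposal but moves slowly; a large spread proposes bold moves that are mostly rejected. Tuning sigma_pro_mu trades off between the two.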
The parameters (and their dimensions) in the model include: K, z (N x 1), w (K x 1), lambda (K x J), mu (K x J), Sigma (J x J x K).
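To make the listed dimensions concrete, the following base-R sketch assembles a J x J x K array of covariance matrices from shared eigenvectors and a K x J matrix of eigenvalues. The toy data, the Gamma draws for `lambda`, and all variable names are assumptions for illustration; the package's internal representation may differ.

```r
set.seed(1)
N <- 50; J <- 2; K <- 3

X <- matrix(rnorm(N * J), nrow = N, ncol = J)    # toy N x J data matrix

# Shared eigenvectors estimated from the sample covariance of X.
E <- eigen(cov(X), symmetric = TRUE)$vectors     # J x J

# Hypothetical per-cluster eigenvalues, one row per cluster (K x J).
lambda <- matrix(rgamma(K * J, shape = 2, rate = 2), nrow = K, ncol = J)

# Assemble Sigma as a J x J x K array: Sigma[, , k] = E diag(lambda[k, ]) E'.
Sigma <- array(NA_real_, dim = c(J, J, K))
for (k in seq_len(K)) {
  Sigma[, , k] <- E %*% diag(lambda[k, ], nrow = J) %*% t(E)
}

dim(Sigma)
```

Each slice `Sigma[, , k]` is symmetric and positive definite by construction, because the eigenvalues in `lambda` are positive.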
If any parameter is fixed, then K must be fixed as well.
Value
a dppmix_mcmc object containing posterior samples of the parameters
References
Yanxun Xu, Peter Mueller, Donatello Telesca. Bayesian Inference for Latent Biologic Structure with Determinantal Point Processes. Biometrics. 2016;72(3):955-64.
Examples
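set.seed(1)
# Simulate two small clusters (three points each) around the given means
ns <- c(3, 3)
means <- list(c(-6, -3), c(0, 4))
d <- rmvnorm_clusters(ns, means)
# Fit the DPP mixture model by MCMC
mcmc <- dppmix_mvnorm(d$X, verbose=FALSE)
# Summarize the posterior samples into parameter estimates
res <- estimate(mcmc)
# Cross-tabulate the true labels against the estimated cluster assignments
table(d$cl, res$z)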
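Cluster labels in a mixture model are arbitrary, so a table like the one above is best read up to a relabelling of the columns. The base-R sketch below uses hypothetical label vectors (`truth` and `estimate_z`, standing in for `d$cl` and `res$z`) and a simple majority-overlap heuristic; it is not part of the package.

```r
# Hypothetical label vectors standing in for d$cl and res$z.
truth      <- c(1, 1, 1, 2, 2, 2)
estimate_z <- c(2, 2, 2, 1, 1, 1)   # same partition, labels swapped

tab <- table(truth, estimate_z)
tab

# Agreement up to relabelling: match each estimated cluster to the true
# cluster it overlaps most (a simple heuristic, adequate for clean cases).
matched <- sum(apply(tab, 2, max))
agreement <- matched / length(truth)
agreement
```

Here the two partitions are identical apart from the swapped labels, so the agreement is 1 even though no label values coincide.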