R: Clustering function

MNARcluster {MNARclust}

R Documentation

Clustering function

Description

Clustering method for continuous or mixed-type data with missing values. The missingness mechanism may be non-ignorable. The approach relies on a semi-parametric mixture model.

Usage

MNARcluster(
  x,
  K,
  nbinit = 20,
  nbCPU = 1,
  tol = 0.01,
  band = band.default(x),
  seedvalue = 123
)

Arguments

`x`	matrix of observations used for clustering
`K`	number of mixture components (clusters)
`nbinit`	number of random starting points
`nbCPU`	number of CPUs used for parallel computing (parallel computing is only available on Unix and Linux systems)
`tol`	tolerance used in the stopping rule
`band`	bandwidth (numeric vector)
`seedvalue`	value of the seed (used to set the initializations of the MM algorithm)
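
Because values of nbCPU above 1 are only supported on Unix and Linux systems, a caller may prefer to choose the value programmatically. The following is a minimal sketch: the helper name choose_nbCPU and its default cap of 2 workers are illustrative assumptions, not part of the package.

```r
# Hypothetical helper (not part of MNARclust): pick a value for nbCPU.
choose_nbCPU <- function(max_workers = 2L) {
  # Parallel runs are documented as Unix/Linux only, so fall back to 1 elsewhere.
  # Note: .Platform$OS.type is "unix" on all Unix-like systems.
  if (.Platform$OS.type != "unix") return(1L)
  available <- parallel::detectCores()
  # detectCores() may return NA when the core count cannot be determined.
  if (is.na(available)) return(1L)
  as.integer(max(1L, min(max_workers, available)))
}

nb <- choose_nbCPU()
```

The resulting value could then be passed as MNARcluster(x, K, nbCPU = nb).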

Value

Returns a list containing the proportions (proportions), the matrix of probabilities of missingness (rho), the posterior probabilities of classification (classproba), the partition (zhat) and the logarithm of the smoothed likelihood (logSmoothlike).
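
To illustrate how some of the returned components relate to each other, the sketch below builds a mock list with a subset of the documented element names. It assumes that classproba is an n-by-K matrix with one row per observation, that proportions is a vector of length K, and that zhat corresponds to the most probable component of each row. These shapes and the mock values are assumptions made for illustration and do not come from a real fit.

```r
# Mock result mimicking some documented names; values are made up for illustration.
res_mock <- list(
  proportions = c(0.6, 0.4),
  classproba  = rbind(c(0.9, 0.1),
                      c(0.2, 0.8),
                      c(0.7, 0.3)),
  zhat        = c(1, 2, 1)
)

# Hard assignment obtained by taking the most probable component in each row.
map_partition <- max.col(res_mock$classproba, ties.method = "first")

# Cluster sizes and the per-observation certainty of the assignment.
cluster_sizes <- table(map_partition)
certainty <- apply(res_mock$classproba, 1, max)
```

Observations with a low certainty value are those whose cluster membership is the most ambiguous.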

References

Clustering Data with Non-Ignorable Missingness using Semi-Parametric Mixture Models, Marie Du Roy de Chaumaray and Matthieu Marbac <arXiv:2009.07662>.

Examples


set.seed(123)
# Data generation
ech <- rMNAR(n=100, K=2, d=4, delta=2, gamma=2)
# Clustering
res <- MNARcluster(ech$x, K=2)
# Confusion matrix between the estimated and the true partition
table(res$zhat, ech$z)
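
Cluster labels are arbitrary, so the large counts of the confusion matrix may fall off the diagonal. A minimal sketch of an agreement rate that tries every relabeling is given below. The function name best_agreement is hypothetical (it is not part of the package), the labels are toy values, and the brute-force enumeration of permutations is only practical for small K.

```r
# Hypothetical helper (not part of MNARclust): agreement up to label switching.
best_agreement <- function(est, truth) {
  tab <- table(factor(est), factor(truth))
  K <- max(nrow(tab), ncol(tab))
  # Pad to a square K x K matrix so that every relabeling is well defined.
  sq <- matrix(0, K, K)
  sq[seq_len(nrow(tab)), seq_len(ncol(tab))] <- tab
  # Enumerate all permutations of a vector recursively.
  perms <- function(v) {
    if (length(v) <= 1) return(list(v))
    out <- list()
    for (i in seq_along(v)) {
      for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
    }
    out
  }
  best <- 0
  for (p in perms(seq_len(K))) {
    # Count observations where estimated label i is matched to true label p[i].
    best <- max(best, sum(sq[cbind(seq_len(K), p)]))
  }
  best / length(est)
}

# Toy labels: the estimate is the truth with labels 1 and 2 swapped, plus one error.
truth <- c(1, 1, 1, 2, 2, 2)
est   <- c(2, 2, 2, 1, 1, 2)
agreement <- best_agreement(est, truth)
```

With the objects of the example above, the same helper could be called as best_agreement(res$zhat, ech$z).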

[Package MNARclust version 1.1.0]