MNARcluster {MNARclust} R Documentation

Clustering function

Description

Clustering method for continuous or mixed-type data with missing values. The missingness mechanism can be non-ignorable. The approach relies on a semi-parametric mixture model.

Usage

MNARcluster(
  x,
  K,
  nbinit = 20,
  nbCPU = 1,
  tol = 0.01,
  band = band.default(x),
  seedvalue = 123
)

Arguments

x

matrix used for clustering

K

number of components

nbinit

number of random starting points

nbCPU

number of CPUs used for parallel computing (parallel computing is only available on Unix and Linux systems)

tol

tolerance used in the stopping rule

band

bandwidth (numeric vector).

seedvalue

value of the seed (used to set the initializations of the MM algorithm)

Value

Returns a list containing the proportions (proportions), the matrix of probabilities of missingness (rho), the posterior probabilities of classification (classproba), the partition (zhat) and the logarithm of the smoothed likelihood (logSmoothlike).

References

Clustering Data with Non-Ignorable Missingness using Semi-Parametric Mixture Models, Marie Du Roy de Chaumaray and Matthieu Marbac <arXiv:2009.07662>.

Examples


set.seed(123)
# Data generation
ech <- rMNAR(n=100, K=2, d=4, delta=2, gamma=2)
# Clustering
res <- MNARcluster(ech$x, K=2)
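# Inspecting the returned list (a sketch: the component names are those
# listed under Value; their exact dimensions are not documented here)
res$proportions        # estimated proportions
res$rho                # estimated probabilities of missingness
head(res$classproba)   # posterior probabilities of classification
res$logSmoothlike      # logarithm of the smoothed likelihood
# Share of missing entries in the simulated data
# (assumption: missing values in ech$x are coded as NA)
mean(is.na(ech$x))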
# Confusion matrix between the estimated and the true partition
table(res$zhat, ech$z)
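# Same call with the tuning arguments from Usage written out explicitly
# (a sketch: the values below are illustrative choices, not recommendations;
# band is left at its default, band.default(x))
res2 <- MNARcluster(ech$x, K=2, nbinit=5, nbCPU=1, tol=0.01, seedvalue=123)
# Cross-tabulation of the two estimated partitions
# (cluster labels may be permuted between runs)
table(res$zhat, res2$zhat)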


[Package MNARclust version 1.1.0 Index]