slopeHeuristic {RMixtComp} | R Documentation |
Slope heuristic
Description
Criterion for choosing the number of clusters.
Usage
slopeHeuristic(object, K0 = floor(max(object$nClass) * 0.4))
Arguments
object |
output of |
K0 |
class-count threshold: only models with more than K0 classes are used to compute the constant C (see Details) |
Details
The slope heuristic criterion is LL_k - 2 * C * D_k, where LL_k is the log-likelihood for k classes, D_k is the number of free parameters for k classes, and C is the slope of the linear regression between D_k and LL_k, fitted only on the models with k > K0.
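The formula above can be sketched in base R on plain numeric vectors. This is an illustration only, not the package's implementation: the helper name slope_heuristic_sketch and the toy values are hypothetical, and the sketch assumes that C is obtained by regressing LL_k on D_k and that the criterion is to be maximized, as is usual for a penalized log-likelihood.

```r
# Hypothetical helper (not part of RMixtComp): computes LL_k - 2 * C * D_k
# from plain numeric vectors.
slope_heuristic_sketch <- function(nClass, LL, D, K0) {
  stopifnot(length(nClass) == length(LL), length(LL) == length(D))
  keep <- nClass > K0
  stopifnot(sum(keep) >= 2)  # at least two points are needed to fit a slope
  # Assumption: C is the slope from regressing LL_k on D_k over k > K0
  fit <- lm(LL[keep] ~ D[keep])
  C <- unname(coef(fit)[2])
  crit <- LL - 2 * C * D
  names(crit) <- nClass
  crit
}

# Toy values, made up for illustration: the log-likelihood rises sharply up
# to k = 3, then only linearly in the number of free parameters.
nClass <- 1:8
D <- 5 * nClass  # assume 5 free parameters per class
LL <- c(-1000, -700, -500, -490, -480, -470, -460, -450)

# K0 = 3 matches the default floor(max(nClass) * 0.4) for these values
crit <- slope_heuristic_sketch(nClass, LL, D, K0 = 3)

# Assumption: the preferred number of classes maximizes the criterion
best_k <- nClass[which.max(crit)]
```

K0 should be large enough that the retained models lie on the linear part of the log-likelihood curve. If it is too small, the fitted slope also absorbs the steep initial gains and the penalty becomes too strong.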
Value
The values of the slope heuristic criterion.
Author(s)
Quentin Grimonprez
References
Cathy Maugis, Bertrand Michel. Slope heuristics for variable selection and clustering via Gaussian mixtures. [Research Report] RR-6550, INRIA. 2008. inria-00284620v2
Jean-Patrick Baudry, Cathy Maugis, Bertrand Michel. Slope Heuristics: Overview and Implementation. 2010. hal-00461639
Examples
data(titanic)
## Use the MixtComp format
dat <- titanic
# refactor categorical data: survived, sex, embarked and pclass
dat$sex <- refactorCategorical(dat$sex, c("male", "female", NA), c(1, 2, "?"))
dat$embarked <- refactorCategorical(dat$embarked, c("C", "Q", "S", NA), c(1, 2, 3, "?"))
dat$survived <- refactorCategorical(dat$survived, c(0, 1, NA), c(1, 2, "?"))
dat$pclass <- refactorCategorical(dat$pclass, c("1st", "2nd", "3rd"), c(1, 2, 3))
# replace all NA by ?
dat[is.na(dat)] <- "?"
# create model
model <- list(
  pclass = "Multinomial",
  survived = "Multinomial",
  sex = "Multinomial",
  age = "Gaussian",
  sibsp = "Poisson",
  parch = "Poisson",
  fare = "Gaussian",
  embarked = "Multinomial"
)
# create algo
algo <- createAlgo()
# run clustering
resLearn <- mixtCompLearn(dat, model, algo, nClass = 2:25, criterion = "ICL", nRun = 3, nCore = 1)
out <- slopeHeuristic(resLearn, K0 = 6)
[Package RMixtComp version 4.1.4 Index]