model.selection.R {ClustMMDD}    R Documentation

Selection of both the number K of clusters and the subset S of clustering variables.

Description

Inference on both the number K of clusters and the subset S of clustering variables is treated as a model selection problem. Each competing model is characterized by one value of (K, S). The competing models are compared using the penalized criteria AIC, BIC and ICL, and a more general penalized criterion with a penalty function of the form

pen(K, S) = α * λ * dim(K, S),

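where α is the coefficient given by the argument alpha, λ is the penalty constant given by the argument cte, and dim(K, S) is the dimension of the model (K, S).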

Usage

model.selection.R(fileOrData, cte = as.double(1), alpha = as.double(2.0), header = TRUE,
  lines = integer())

Arguments

fileOrData

A character string (the name of a file) or a data frame (see backward.explorer). If fileOrData is a data frame, it must contain a column named logLik and another named dim (see details).

cte

A penalty function parameter. The associated criterion is -log(likelihood)+cte*dim.

alpha

A coefficient in [1.5,2]. The default value is 2.

header

Logical. Indicates whether the file has a header line.

lines

A vector of integers. If it is not empty and fileOrData is the name of a file, only the models defined on the given lines are compared.
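
To illustrate how cte and alpha enter the comparison, here is a minimal base-R sketch. It assumes the criterion to minimize is -logLik + alpha * cte * dim, which combines the penalty given in the Description with the cte entry above; the computation inside the package may differ. The table toy and the helper penalized_crit are hypothetical and not part of ClustMMDD.

```r
# Hypothetical table of explored models: one row per model (K, S),
# with the logLik and dim columns that model.selection.R requires.
toy <- data.frame(
  K      = c(1, 2, 3, 4),
  logLik = c(-520, -480, -470, -468),
  dim    = c(10, 21, 32, 43)
)

# Assumed form of the general criterion: -logLik + alpha * cte * dim.
penalized_crit <- function(models, cte = 1, alpha = 2) {
  stopifnot(all(c("logLik", "dim") %in% names(models)))
  -models$logLik + alpha * cte * models$dim
}

crit <- penalized_crit(toy, cte = 1, alpha = 2)

# The selected model is the row with the smallest criterion value.
best <- toy[which.min(crit), ]
best
```

In this toy table a larger cte or alpha penalizes high-dimensional models more heavily, so the selected K can only stay the same or decrease.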

Value

A data frame of the selected models for the proposed penalized criteria.
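
As a rough picture of such a result, the sketch below builds a table with one selected model per criterion, in base R. It assumes the usual forms AIC = -logLik + dim and BIC = -logLik + 0.5 * log(n) * dim on the -log(likelihood) scale; the table toy, the sample size n and the object selected are hypothetical, and the columns actually returned by model.selection.R may differ. ICL is left out because it needs an entropy term that this toy table does not contain.

```r
# Hypothetical table of explored models and hypothetical sample size.
toy <- data.frame(
  K      = 1:4,
  logLik = c(-520, -480, -468, -466),
  dim    = c(10, 21, 32, 43)
)
n <- 200

# Criterion values on the -log(likelihood) scale (assumed forms).
crit <- list(
  AIC = -toy$logLik + toy$dim,
  BIC = -toy$logLik + 0.5 * log(n) * toy$dim
)

# One row per criterion: the model that minimizes it.
selected <- do.call(rbind, lapply(names(crit), function(nm) {
  data.frame(criterion = nm, toy[which.min(crit[[nm]]), ], row.names = NULL)
}))
selected
```

Here the two criteria disagree, because the BIC penalty grows with log(n) and therefore favors a lower-dimensional model than AIC.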

Author(s)

Wilson Toussile

See Also

backward.explorer, dimJump.R.

Examples

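# Load the example table of explored models.
data(genotype2_ExploredModels)
# Run dimJump.R and use its first returned value as the penalty constant cte.
outDimJump = dimJump.R(genotype2_ExploredModels, N = 1000, h = 5, header = TRUE)
cte1 = outDimJump[[1]][1]
# Compare the models and print the selected ones for each criterion.
outSelection = model.selection.R(genotype2_ExploredModels, cte = cte1, header = TRUE)
outSelection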

[Package ClustMMDD version 1.0.4 Index]