backward.explorer {ClustMMDD}    R Documentation

Gather a set of the most competitive models.

Description

This function gathers a set of the most competitive models using a backward-stepwise strategy. The visited models are written to a file whose name ends with the suffix "_ExploredModels.txt". The algorithm is described in Toussile and Gassiat (2009).

Usage

backward.explorer(x, Kmax, Criterion, ploidy = 1,
  ForceExclusion = FALSE, emOptions = list(epsi = NULL, nberSmallEM = NULL,
  nberIterations = NULL, nberMaxIterations = NULL, typeSmallEM = NULL, typeEM =
  NULL, putThreshold = NULL), Kmin = 1, Smin = NULL,
  project = deparse(substitute(x)))

Arguments

x

A matrix of strings that contains the data.

Kmax

The maximum number of clusters to be explored.

Criterion

The model selection criterion used for exploration, one of c("BIC", "AIC", "ICL", "CteDim") (see Details).

ploidy

The number of columns for each variable in the data. For example, ploidy = 2 for genotypic data from diploid individuals.

ForceExclusion

A logical value indicating whether to force exclusion. The default value is FALSE.

emOptions

A list of EM options (see EmOptions and setEmOptions).

Kmin

The minimum number of clusters. The default value is set to 1.

Smin

A logical vector that indicates the variables to include in the selected set of clustering variables. The default value is NULL, meaning that no variable is preselected.

project

The name of the project. The default value is the name of the dataset.
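
As an illustration of how the ploidy and Smin arguments relate to the shape of x, here is a minimal sketch on toy data. The allele labels, the number of loci, and the assumption that the columns of one variable sit next to each other are all hypothetical and are not taken from the package; the call to backward.explorer is left commented out because it requires ClustMMDD and writes a file.

```r
# Toy diploid data (hypothetical): 4 individuals, 3 loci,
# 2 columns per locus, so ploidy = 2 and ncol(x) = 6.
# Assumption: the two columns of a locus are adjacent.
ploidy <- 2
x <- matrix(
  c("101", "103", "A", "B", "20", "22",
    "101", "101", "B", "B", "20", "20",
    "103", "105", "A", "A", "22", "24",
    "105", "105", "A", "B", "20", "24"),
  nrow = 4, byrow = TRUE
)

# One entry of Smin per variable (assumed), not per column.
n_vars <- ncol(x) / ploidy
Smin <- rep(FALSE, n_vars)
Smin[2] <- TRUE  # preselect the second variable

# Not run: requires ClustMMDD and creates "x_ExploredModels.txt".
# out <- backward.explorer(x, Kmax = 3, Criterion = "BIC",
#                          ploidy = ploidy, Smin = Smin)
```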

Details

If the penalized criterion is CteDim, a sequence of penalty functions of the form pen(K, S) = λ * dim(K, S) is used. In this shape of penalty function, λ is in [0.5, log(N)], where N is the number of individuals in the sample data. Thus, the AIC and BIC penalties are in the sequence of candidate penalties.
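
The effect of sweeping λ over this interval can be sketched in a few lines of base R. Everything below is illustrative: the candidate models, their dimensions and log-likelihoods are made up, the grid of λ values is evenly spaced by assumption, and the criterion is taken to be the negative log-likelihood plus the penalty. The package may build its grid and its criterion differently.

```r
# Hypothetical sample size and candidate models (not package output).
N <- 200
models <- data.frame(
  K      = c(1, 2, 3),           # number of clusters
  dim    = c(20, 45, 70),        # model dimension dim(K, S)
  logLik = c(-1500, -1420, -1405)
)

# Assumed evenly spaced grid of lambda values in [0.5, log(N)].
lambdas <- seq(0.5, log(N), length.out = 5)

# For each lambda, pick the model minimizing -logLik + lambda * dim.
best <- sapply(lambdas, function(lambda) {
  crit <- -models$logLik + lambda * models$dim
  models$K[which.min(crit)]
})

data.frame(lambda = lambdas, selected_K = best)
```

Small values of λ favor the larger model and large values favor the smaller one, which is why the choice of λ needs calibration.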

Value

A data.frame of the models selected for the chosen criterion.

Author(s)

Wilson Toussile

References

See Also

dimJump.R for the data-driven calibration of the penalty function, and model.selection.R for the final model selection.

Examples

data(genotype1)
head(genotype1) 
genotype2 = cutEachCol(genotype1[, -11], ploidy = 2)
head(genotype2)

# The following command creates a file "genotype2_ExploredModels.txt"
# that contains the most competitive models.

#output = backward.explorer(genotype2, Kmax = 10, ploidy = 2, Kmin = 1, Criterion = "CteDim")

data(genotype2_ExploredModels)
head(genotype2_ExploredModels)

[Package ClustMMDD version 1.0.4 Index]