datacapushe {capushe} R Documentation

datacapushe

Description

An example data frame for the capushe package, based on a simulated Gaussian mixture dataset in $\mathbb{R}^3$.

Usage

data(datacapushe)

Format

A data frame with 50 rows (models) and the following 4 variables:

model
a character vector: model names.

pen
a numeric vector: model penalty shape values.

complexity
a numeric vector: model complexity values.

contrast
a numeric vector: model contrast values.

Details

The simulated dataset is composed of $n=1000$ observations in $\mathbb{R}^3$. It consists of an equiprobable mixture of three large "bubble" groups centered at $\nu_1=(0,0,0)$, $\nu_2=(6,0,0)$ and $\nu_3=(0,6,0)$ respectively. Each bubble group $j$ is simulated from a mixture of seven components with the following density:

$$x\in\mathbb{R}^3\rightarrow 0.4\,\Phi(x|\mu_1+\nu_j,I_3)+\sum_{k=2}^{7}0.1\,\Phi(x|\mu_k+\nu_j,0.1I_3)$$

with $\mu_1=(0,0,0)$, $\mu_2=(0,0,1.5)$, $\mu_3=(0,1.5,0)$, $\mu_4=(1.5,0,0)$, $\mu_5=(0,0,-1.5)$, $\mu_6=(0,-1.5,0)$ and $\mu_7=(-1.5,0,0)$. Thus the distribution of the dataset is actually a 21-component Gaussian mixture (three bubble groups of seven components each).
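The simulation described above can be sketched in base R. This is an illustrative sketch, not the code used to produce the packaged data: the seed and all variable names (`nu`, `mu`, `w`, `sigma2`, `bubble`, `comp`, `props`) are assumptions, and the second argument of $\Phi$ is read as a covariance matrix, so the small components have variance 0.1.

```r
# Sketch: draw n = 1000 points from the 21-component mixture described above.
set.seed(1)  # arbitrary seed, for reproducibility only
n <- 1000

# Bubble centers nu_1..nu_3 (rows) and within-bubble offsets mu_1..mu_7 (rows).
nu <- rbind(c(0, 0, 0), c(6, 0, 0), c(0, 6, 0))
mu <- rbind(c(0, 0, 0),
            c(0, 0, 1.5), c(0, 1.5, 0), c(1.5, 0, 0),
            c(0, 0, -1.5), c(0, -1.5, 0), c(-1.5, 0, 0))

# Within-bubble weights and variances: component 1 is the large one.
# ASSUMPTION: Phi(x | m, S) takes a covariance S, so 0.1 * I_3 means variance 0.1.
w      <- c(0.4, rep(0.1, 6))
sigma2 <- c(1, rep(0.1, 6))

# Pick a bubble (equiprobable), then a component within that bubble.
bubble <- sample(1:3, n, replace = TRUE)
comp   <- sample(1:7, n, replace = TRUE, prob = w)

# Spherical Gaussian noise; row i is scaled by the standard deviation
# of the component drawn for observation i.
noise <- matrix(rnorm(n * 3), ncol = 3) * sqrt(sigma2[comp])
x <- nu[bubble, ] + mu[comp, ] + noise

# Overall mixing proportions of the 3 * 7 = 21 components.
props <- as.vector(outer(w, rep(1 / 3, 3)))
```

Here `x` is a 1000 x 3 matrix of simulated observations, and `props` holds the 21 overall mixing proportions, which is why the full distribution is a 21-component mixture.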

A model collection of spherical Gaussian mixtures is considered, and the data frame datacapushe contains the maximum likelihood estimates for each of these models. The number of free parameters of each model is used as its complexity value, and $pen_{shape}$ is defined as this complexity divided by $n$.
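As a rough illustration of how the complexity and $pen_{shape}$ columns relate, the sketch below counts free parameters for a spherical Gaussian mixture. The helper `spherical_complexity` is hypothetical, and the parameterization (free mixing proportions, free means, one free variance per component) and the range of 1 to 50 components are assumptions; the values stored in datacapushe may follow a different convention.

```r
# Hypothetical helper: free-parameter count for a K-component spherical
# Gaussian mixture in dimension d, ASSUMING free mixing proportions (K - 1),
# free means (K * d) and one free variance per component (K).
spherical_complexity <- function(K, d = 3) {
  (K - 1) + K * d + K
}

n <- 1000
K <- 1:50  # ASSUMPTION: one model per number of components
complexity <- spherical_complexity(K)
pen <- complexity / n  # pen_shape: complexity divided by n

# Toy frame with the same column names as datacapushe (no contrast column,
# since contrast values come from fitting the models to the data).
toy <- data.frame(model = as.character(K), pen = pen, complexity = complexity)
head(toy)
```

Under these assumptions, the 21-component model has 104 free parameters, giving a penalty shape value of 0.104.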

datapartialcapushe and datavalidcapushe can be used to run the validation function. datapartialcapushe contains only the models with fewer than 21 components. datavalidcapushe contains three models with 30, 40 and 50 components respectively.

Source

http://www.math.univ-toulouse.fr/~maugis/CAPUSHE.html

References

Article: Baudry, J.-P., Maugis, C. and Michel, B. (2011) Slope heuristics: overview and implementation. Statistics and Computing, to appear. doi: 10.1007/s11222-011-9236-1

Examples

library(capushe)

data(datacapushe)
capushe(datacapushe, n = 1000)
## BIC, DDSE and Djump all select the true model
plot(capushe(datacapushe))

## Validation:
data(datapartialcapushe)
capushepartial <- capushe(datapartialcapushe)
data(datavalidcapushe)
## The slope heuristics should not be applied to datapartialcapushe.
validation(capushepartial, datavalidcapushe)

[Package capushe version 1.1.2 Index]