boot.comp {mixtools}

R Documentation

Performs Parametric Bootstrap for Sequentially Testing the Number of Components in Various Mixture Models

Description

Performs a parametric bootstrap by producing B bootstrap realizations of the likelihood ratio statistic for testing the null hypothesis of a k-component fit versus the alternative hypothesis of a (k+1)-component fit to various mixture models. The tests are performed sequentially, up to a specified maximum number of components. A p-value is calculated for each test, and testing terminates once a p-value is above the specified significance level. An optional histogram showing the bootstrap distribution of the likelihood ratio statistic, along with the observed statistic, can also be produced.
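The sequential decision rule described above can be sketched in base R. This is only an illustration, not the package's implementation: `sequential_test`, `obs_stats`, and `boot_stats` are hypothetical names, the statistics are simulated rather than obtained from fitted mixtures, and the p-value is assumed to be the proportion of bootstrap statistics at least as large as the observed one.

```r
# Sketch of the sequential rule (base R only; no mixture models are fitted).
# `boot_stats` is a hypothetical list: element k holds the bootstrap
# likelihood ratio statistics for the test of k versus (k+1) components.
# `obs_stats` holds the matching observed statistics.
sequential_test <- function(obs_stats, boot_stats, sig = 0.05) {
  p_values <- numeric(0)
  for (k in seq_along(obs_stats)) {
    # Assumed bootstrap p-value: share of bootstrap statistics that are
    # at least as large as the observed statistic.
    p <- mean(boot_stats[[k]] >= obs_stats[k])
    p_values <- c(p_values, p)
    # Fail to reject the k-component null hypothesis: stop testing.
    if (p > sig) break
  }
  p_values
}

# Simulated stand-ins for the statistics of two consecutive tests.
set.seed(1)
boot_stats <- list(rchisq(100, df = 3), rchisq(100, df = 3))
obs_stats <- c(50, 1)
p_values <- sequential_test(obs_stats, boot_stats)
p_values
```

Here the first observed statistic is far in the tail of its bootstrap distribution, so the first null hypothesis is rejected and a second test is run; the second test fails to reject and the loop stops.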

Usage

boot.comp(y, x = NULL, N = NULL, max.comp = 2, B = 100,
          sig = 0.05, arbmean = TRUE, arbvar = TRUE,
          mix.type = c("logisregmix", "multmix", "mvnormalmix",
          "normalmix", "poisregmix", "regmix", "regmix.mixed", 
          "repnormmix"), hist = TRUE, ...)

Arguments

`y`	The raw data for `multmix`, `mvnormalmix`, `normalmix`, and `repnormmix`, and the response values for `logisregmix`, `poisregmix`, and `regmix`. See the documentation of the respective EM algorithms for the specific structure of the raw data.
`x`	The predictor values, required only for the regression mixtures `logisregmix`, `poisregmix`, and `regmix`. A column of 1s for the intercept term must not be included! See the documentation of the respective EM algorithms for the specific structure of the predictor values.
`N`	An n-vector of number of trials for the logistic regression type `logisregmix`. If NULL, then `N` is an n-vector of 1s for binary logistic regression.
`max.comp`	The maximum number of components to test for. The default is 2. This function tests k components versus (k+1) components sequentially until it fails to reject the null hypothesis or `max.comp` is reached. The decision rule is governed by the calculated p-value and `sig`.
`B`	The number of bootstrap realizations of the likelihood ratio statistic to produce. The default is 100, but values of 1000 or more are preferable.
`sig`	The significance level against which the p-value is compared when performing the test of k components versus (k+1) components. The default is 0.05.
`arbmean`	If FALSE, then a scale mixture analysis can be performed for `mvnormalmix`, `normalmix`, `regmix`, or `repnormmix`. The default is TRUE.
`arbvar`	If FALSE, then a location mixture analysis can be performed for `mvnormalmix`, `normalmix`, `regmix`, or `repnormmix`. The default is TRUE.
`mix.type`	The type of mixture analysis you wish to perform. The data inputted for `y` and `x` depend on which type of mixture is selected. `logisregmix` corresponds to a mixture of logistic regressions. `multmix` corresponds to a mixture of multinomials with data determined by the cut-point method. `mvnormalmix` corresponds to a mixture of multivariate normals. `normalmix` corresponds to a mixture of univariate normals. `poisregmix` corresponds to a mixture of Poisson regressions. `regmix` corresponds to a mixture of regressions with normal components. `regmix.mixed` corresponds to a mixture of regressions with random or mixed effects. `repnormmix` corresponds to a mixture of normals with repeated measurements.
`hist`	If TRUE, a matrix plot of histograms of the bootstrapped likelihood ratio statistic is produced. The default is TRUE.
`...`	Additional arguments passed to the various EM algorithms for the mixture of interest.

Value

boot.comp returns a list with items:

`p.values`	The p-values for each test of k-components versus (k+1)-components.
`log.lik`	The B bootstrap realizations of the likelihood ratio statistic.
`obs.log.lik`	The observed likelihood ratio statistic for each test which is used in determining the p-values.
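As a rough illustration of how the returned items relate, the following base R sketch builds a hypothetical stand-in for a result, recomputes a p-value from it, and draws a histogram with the observed statistic marked. The list-of-vectors layout of `log.lik`, the simulated values, and the p-value formula are assumptions made for the sketch; they are not confirmed by this page.

```r
# Hypothetical stand-in for a boot.comp() result with a single test.
# The layout of `log.lik` (one vector of bootstrap statistics per test)
# is an assumption for illustration.
set.seed(42)
res <- list(
  log.lik     = list(rchisq(200, df = 3)),
  obs.log.lik = 9.5
)

# Assumed p-value: share of bootstrap statistics at least as large as
# the observed statistic.
res$p.values <- mean(res$log.lik[[1]] >= res$obs.log.lik[1])

# Histogram of the bootstrap statistics with the observed value marked.
hist(res$log.lik[[1]], breaks = 20,
     main = "1 versus 2 components",
     xlab = "Bootstrap likelihood ratio statistic")
abline(v = res$obs.log.lik[1], col = "red", lwd = 2)
```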

References

McLachlan, G. J. and Peel, D. (2000) Finite Mixture Models, John Wiley and Sons, Inc.

Examples

## Bootstrapping to test the number of components on the RTdata.

data(RTdata)
set.seed(100)
x <- as.matrix(RTdata[, 1:3])
y <- makemultdata(x, cuts = quantile(x, (1:9)/10))$y
a <- boot.comp(y = y, max.comp = 1, B = 5, mix.type = "multmix", 
               epsilon = 1e-3)
a$p.values
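
A second, hedged sketch applies the same call pattern to `mix.type = "normalmix"`. The data are simulated purely for illustration (`y.sim` is a hypothetical name), `B` is kept small only to keep the run short, and the code is skipped if the mixtools package is not installed.

```r
## Sketch: testing 1 versus 2 components on simulated univariate normal data.
## The simulated sample is illustrative and not part of the package.
if (requireNamespace("mixtools", quietly = TRUE)) {
  library(mixtools)
  set.seed(100)
  # Two well-separated normal groups of 100 observations each.
  y.sim <- c(rnorm(100, mean = 0), rnorm(100, mean = 5))
  b <- boot.comp(y = y.sim, max.comp = 1, B = 20,
                 mix.type = "normalmix", hist = FALSE)
  b$p.values
}
```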

[Package mixtools version 2.0.0 Index]