Kfind.boxcox {boxcoxmix}    R Documentation

Grid search over K for NPML estimation of random effect and variance component models

Description

Performs a grid search over the parameter K to find the best number of mass points, as judged by the chosen model selection criterion.

Usage

Kfind.boxcox(
  formula,
  groups = 1,
  data,
  lambda = 1,
  EMdev.change = 1e-04,
  steps = 500,
  find.k = c(2, 10),
  model.selection = "aic",
  start = "gq",
  find.tol = c(0, 1.5),
  steps.tol = 15,
  ...
)

Arguments

formula

a formula describing the transformed response and the fixed effect model (e.g. y ~ x).

groups

the random effects. To fit overdispersion models, set groups = 1.

data

a data frame containing variables used in the fixed and random effect models.

lambda

a transformation parameter; setting lambda = 1 means 'no transformation'.

EMdev.change

a small scalar, with default 0.0001, used to determine when to stop the EM algorithm.

steps

maximum number of iterations for the EM algorithm.

find.k

the range of K to search, given as a vector of lower and upper limits, with default c(2, 10), in steps of 1.

model.selection

the model selection criterion: set model.selection = "aic" to use the Akaike information criterion, or model.selection = "bic" to use the Bayesian information criterion.

start

a description of the initial values to be used in the fitted model: either the quantile-based version "quantile" or Gaussian quadrature "gq".

find.tol

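the range of tol to search, given as a vector of lower and upper limits, with default c(0, 1.5). The number of grid points within this range is set by steps.tol.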

steps.tol

number of points in the grid search of tol.

...

extra arguments will be ignored.

Details

Skewness may be caused not only by the shape of the distribution but also by the use of an insufficient number of classes, K. The Kfind.boxcox() function therefore searches over a selected range of K to find the best value. For each number of classes, a grid search over tol is performed, and the tol with the lowest AIC or BIC value is taken as optimal. Given these minimal AIC or BIC values for the whole pre-specified range of K, Kfind.boxcox() selects the number of components with the smallest value. It also plots the AIC or BIC values against the selected range of K, with a vertical line marking the value of K that minimizes the model selection criterion. The full range of classes and their corresponding optimal tol values can be printed from the output of Kfind.boxcox() and used as starting points in other boxcoxmix functions.
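The two-level search described above can be sketched in a few lines of R. This is only an illustration of the selection logic, not the package's implementation: toy_criterion() is a hypothetical stand-in for the AIC or BIC of a fitted model, and the evenly spaced tol grid is an assumption about how find.tol and steps.tol are combined.

```r
# Hypothetical stand-in for the criterion (AIC or BIC) of a model fitted
# with K mass points and tuning value tol. NOT a boxcoxmix function.
toy_criterion <- function(K, tol) {
  (K - 3)^2 + 4 * (tol - 0.6)^2 + 200
}

find.k    <- c(2, 6)      # lower and upper limit for K
find.tol  <- c(0, 1.5)    # lower and upper limit for tol
steps.tol <- 16           # number of tol grid points

K_grid <- seq(find.k[1], find.k[2], by = 1)
# Assumption: an evenly spaced grid; the package may place its points differently.
tol_grid <- seq(find.tol[1], find.tol[2], length.out = steps.tol)

# Inner search: for each K, keep the tol with the smallest criterion value.
per_K <- lapply(K_grid, function(K) {
  crit <- vapply(tol_grid, function(tol) toy_criterion(K, tol), numeric(1))
  i <- which.min(crit)
  data.frame(K = K, tol = tol_grid[i], criterion = crit[i])
})
per_K <- do.call(rbind, per_K)

# Outer search: pick the K whose best criterion value is smallest overall.
best <- per_K[which.min(per_K$criterion), ]

print(per_K)   # one row per K with its optimal tol
print(best)    # the selected K and tol
```

The per_K table plays the role of the "full range of classes and their corresponding optimal tol" mentioned above, and best corresponds to the selected number of components.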

Value

List with class boxcoxmix containing:

MinDisparity

the minimum disparity found.

Best.K

the value of K corresponding to MinDisparity.

AllMinDisparities

a vector containing all minimum disparities calculated on the grid.

AllMintol

list of tol values used in the grid.

All.K

list of K values used in the grid.

All.aic

the Akaike information criterion of all fitted regression models.

All.bic

the Bayesian information criterion of all fitted regression models.
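As a reminder of what the two criteria measure, the following sketch computes AIC and BIC by hand from a log-likelihood and checks them against R's built-in functions. It uses an ordinary linear model on the built-in cars data purely for illustration. Treating the disparity as -2 times the log-likelihood, and the way parameters are counted, are assumptions here; the values in All.aic and All.bic may count parameters differently.

```r
# Ordinary linear model on built-in data, used only to illustrate the formulas.
fit <- lm(dist ~ speed, data = cars)

ll        <- logLik(fit)
disparity <- -2 * as.numeric(ll)   # assumed meaning: -2 * log-likelihood
p         <- attr(ll, "df")        # number of estimated parameters
n         <- nobs(fit)             # number of observations

aic_manual <- disparity + 2 * p        # AIC penalizes each parameter by 2
bic_manual <- disparity + log(n) * p   # BIC penalizes each parameter by log(n)

# Both comparisons should agree with the stats package.
print(all.equal(aic_manual, AIC(fit)))
print(all.equal(bic_manual, BIC(fit)))
```

Because log(n) exceeds 2 once n is larger than about 7, BIC generally favours fewer mass points than AIC on the same data.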

Author(s)

Amani Almohaimeed and Jochen Einbeck

See Also

tolfind.boxcox.

Examples

# Fabric data
library(boxcoxmix)
data(fabric, package = "npmlreg")
teststr <- Kfind.boxcox(y ~ x, data = fabric, start = "gq", groups = 1,
                        find.k = c(2, 3), model.selection = "aic", steps.tol = 5)
# Minimal AIC: 202.2114 at K= 2

[Package boxcoxmix version 0.42 Index]