R: Check for zero-inflation in count models

check_zeroinflation {performance}

R Documentation

Check for zero-inflation in count models

Description

check_zeroinflation() checks whether count models are over- or underfitting zeros in the outcome.

Usage

check_zeroinflation(x, ...)

## Default S3 method:
check_zeroinflation(x, tolerance = 0.05, ...)

## S3 method for class 'performance_simres'
check_zeroinflation(
  x,
  tolerance = 0.1,
  alternative = c("two.sided", "less", "greater"),
  ...
)

Arguments

`x`	Fitted model of class `merMod`, `glmmTMB`, `glm`, or `glm.nb` (package MASS).
`...`	Arguments passed down to `simulate_residuals()`. This only applies to models with a zero-inflation component, or to models of class `glmmTMB` with the `nbinom1` or `nbinom2` family.
`tolerance`	The tolerance for the ratio of observed and predicted zeros to be considered as over- or underfitting zeros. A ratio within 1 +/- `tolerance` is considered OK, while a ratio outside this range indicates over- or underfitting.
`alternative`	A character string specifying the alternative hypothesis: one of `"two.sided"` (the default), `"less"`, or `"greater"`.
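
As an illustration of how the `tolerance` band can be read, the following minimal sketch classifies a ratio against 1 +/- `tolerance`. The helper `classify_zero_ratio()` is hypothetical and not part of the package, and it assumes the ratio is computed as predicted zeros divided by observed zeros, so that values below the band mean too few zeros are predicted.

```r
# Hypothetical helper for illustration only; not part of 'performance'.
# Assumption: ratio = predicted zeros / observed zeros.
classify_zero_ratio <- function(ratio, tolerance = 0.05) {
  if (ratio < 1 - tolerance) {
    "underfitting zeros"   # fewer zeros predicted than observed
  } else if (ratio > 1 + tolerance) {
    "overfitting zeros"    # more zeros predicted than observed
  } else {
    "OK"                   # ratio lies within 1 +/- tolerance
  }
}

classify_zero_ratio(0.77)
classify_zero_ratio(1.02)
classify_zero_ratio(1.20, tolerance = 0.1)
```

A wider `tolerance` makes the check more lenient, because ratios further from 1 still count as acceptable.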

Details

If the number of observed zeros is larger than the number of predicted zeros, the model is underfitting zeros, which indicates zero-inflation in the data. In such cases, it is recommended to use negative binomial or zero-inflated models.

For negative binomial models, models with a zero-inflation component, or hurdle models, the results from check_zeroinflation() are based on simulate_residuals(), i.e. check_zeroinflation(simulate_residuals(model)) is called internally if necessary.
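
The comparison of observed and predicted zeros can be sketched in base R for a simple Poisson model. This is a rough illustration on simulated data, not the package's implementation: the variable names and the data-generating setup are assumptions made for the example, and the expected number of zeros is taken as the sum of the Poisson probabilities of a zero at each fitted mean.

```r
set.seed(123)
n <- 500
x <- rnorm(n)

# Poisson counts, then roughly 30% of observations forced to zero
# to mimic zero-inflation (illustrative setup only).
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))
y[runif(n) < 0.3] <- 0

m <- glm(y ~ x, family = poisson)
mu <- fitted(m)

observed_zeros <- sum(y == 0)
# Expected number of zeros under the fitted Poisson model
predicted_zeros <- sum(dpois(0, lambda = mu))

# Assumption: ratio defined as predicted / observed
ratio <- predicted_zeros / observed_zeros
c(observed = observed_zeros, predicted = predicted_zeros, ratio = ratio)
```

Here the plain Poisson model predicts fewer zeros than are observed, so the ratio falls below 1, which is the underfitting pattern described above.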

Value

A list with information about the amount of predicted and observed zeros in the outcome, as well as the ratio between these two values.

Tests based on simulated residuals

For certain models, or models from certain families, tests are based on simulated residuals (see simulate_residuals()). These are usually more accurate for such models than the traditionally used Pearson residuals. However, when simulating from more complex models, such as mixed models or models with zero-inflation, there are several important considerations. Arguments specified in ... are passed to simulate_residuals(), which relies on DHARMa::simulateResiduals(), so these arguments are passed further down to DHARMa. The defaults in DHARMa are set to the most conservative option that works for all models. However, in many cases the DHARMa help advises different settings for particular situations or particular models. It is recommended to read the 'Details' in ?DHARMa::simulateResiduals closely to understand the implications of the simulation process and which arguments should be modified to get the most accurate results.
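
The general idea behind a simulation-based zero check can be sketched with base R alone. This is only a conceptual illustration using stats::simulate() on a Poisson GLM; it is not how DHARMa or simulate_residuals() are implemented, and the data, the number of simulations, and the one-sided p-value calculation are assumptions chosen for the example.

```r
set.seed(42)
n <- 500
x <- rnorm(n)

# Illustrative zero-inflated counts (assumed setup)
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))
y[runif(n) < 0.3] <- 0

m <- glm(y ~ x, family = poisson)

# Draw 250 new response vectors from the fitted model;
# the result is a data frame with one column per simulation.
sims <- simulate(m, nsim = 250)

# Number of zeros in each simulated response vs. the observed data
sim_zeros <- colSums(sims == 0)
obs_zeros <- sum(y == 0)

# Share of simulations with at least as many zeros as observed,
# loosely analogous to a one-sided ("greater") test.
p_greater <- mean(sim_zeros >= obs_zeros)
c(observed = obs_zeros, simulated_mean = mean(sim_zeros), p = p_greater)
```

If hardly any simulated data set contains as many zeros as the observed data, the fitted model is unlikely to have produced that many zeros, which points to zero-inflation.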

Examples


data(Salamanders, package = "glmmTMB")
m <- glm(count ~ spp + mined, family = poisson, data = Salamanders)
check_zeroinflation(m)

# for models with zero-inflation component, it's better to carry out
# the check for zero-inflation using simulated residuals
m <- glmmTMB::glmmTMB(
  count ~ spp + mined,
  ziformula = ~ mined + spp,
  family = poisson,
  data = Salamanders
)
res <- simulate_residuals(m)
check_zeroinflation(res)

[Package performance version 0.12.2 Index]