R: Test for zero-excess in Count Regression Models

zero.excess {glmtoolbox}

R Documentation

Test for zero-excess in Count Regression Models

Description

Assesses whether the observed number of zeros is significantly higher than expected under the fitted count regression model (Poisson or negative binomial).

Usage

zero.excess(
  object,
  alternative = c("excess", "lack", "both"),
  method = c("boot", "naive"),
  rep = 100,
  verbose = TRUE
)

Arguments

`object`	an object of the class `glm`, for Poisson regression models, or an object of the class `overglm`, for negative binomial regression models.
`alternative`	an (optional) character string indicating the alternative hypothesis. There are three options: excess of zeros ("excess"), lack of zeros ("lack"), and both ("both"). By default, `alternative` is set to "excess".
`method`	an (optional) character string indicating the method used to calculate the mean and variance of the difference between the observed and estimated expected number of zeros. There are two options: parametric bootstrapping ("boot") and naive ("naive"). By default, `method` is set to "boot".
`rep`	an (optional) positive integer specifying the number of replicates used by the parametric bootstrapping. By default, `rep` is set to 100.
`verbose`	an (optional) logical switch indicating whether the report of results should be printed. By default, `verbose` is set to TRUE.

Details

According to the formulated count regression model, Y_i\sim P(y;\mu_i,\phi) for i=1,\ldots,n are independent random variables. Consequently, the expected number of zeros can be estimated by \sum_{i=1}^{n} P(0;\hat{\mu}_i,\hat{\phi}), where \hat{\mu}_i and \hat{\phi} are the estimates of \mu_i and \phi, respectively, obtained from the fitted model. The test statistic is then defined as the standardized difference between the observed and (estimated) expected number of zeros, and its distribution tends to the standard normal as the sample size, n, tends to infinity. He, Zhang, Ye, and Tang (2019) call this approach a naive test because it ignores the sampling variation associated with the estimated model parameters. To correct this, parametric bootstrapping is used to estimate the mean and variance of the difference between the observed and (estimated) expected number of zeros.
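
The two approaches can be illustrated with a minimal base R sketch for a Poisson model. This is not the package's implementation: the simulated data, the helper `zero_diff`, and the variable names are hypothetical, and the naive variance \sum_i p_{0i}(1-p_{0i}), with p_{0i}=P(0;\hat{\mu}_i), is the variance obtained when the fitted means are assumed to be known.

```r
# Hypothetical simulated data: Poisson counts with a single covariate
set.seed(123)
n <- 200
x <- runif(n)
y <- rpois(n, lambda = exp(0.2 + 0.8 * x))
fit <- glm(y ~ x, family = poisson)

# Difference between the observed and estimated expected number of zeros
zero_diff <- function(y, mu) sum(y == 0) - sum(dpois(0, mu))

mu_hat <- fitted(fit)
d_obs <- zero_diff(y, mu_hat)

# Naive test: treats mu_hat as known, so the difference is assumed to have
# mean 0 and variance sum(p0 * (1 - p0)), where p0 = P(0; mu_hat)
p0 <- dpois(0, mu_hat)
z_naive <- d_obs / sqrt(sum(p0 * (1 - p0)))

# Parametric bootstrap: simulate responses from the fitted model, refit,
# and recompute the difference to estimate its mean and variance
n_rep <- 100
X <- model.matrix(fit)
d_boot <- replicate(n_rep, {
  y_star <- rpois(n, mu_hat)
  fit_star <- glm.fit(X, y_star, family = poisson())
  zero_diff(y_star, fit_star$fitted.values)
})
z_boot <- (d_obs - mean(d_boot)) / sd(d_boot)

# One-sided p-values for the alternative of an excess of zeros
p_naive <- pnorm(z_naive, lower.tail = FALSE)
p_boot <- pnorm(z_boot, lower.tail = FALSE)
```

Because the data here are generated from a Poisson model, large p-values are the typical outcome. The bootstrap version differs from the naive one only in where the centering and scaling constants come from.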

Value

A matrix with 1 row and the following columns:

`Observed`	the observed number of zeros,

`Expected`	the expected number of zeros,

`z-value`	the value of the test statistic,

`p.value`	the p-value of the statistical test.
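
As a hedged illustration of how the returned matrix might be used, the sketch below stores the result and extracts its entries by position, following the column order listed above. It assumes that glmtoolbox is installed and that the function returns the matrix described here; the names `res`, `observed`, `expected`, `z_value` and `p_value` are hypothetical.

```r
# Runs only if the glmtoolbox package (with its swimmers data set) is installed
if (requireNamespace("glmtoolbox", quietly = TRUE)) {
  data(swimmers, package = "glmtoolbox")
  fit <- glm(infections ~ frequency + location, family = poisson, data = swimmers)

  # verbose = FALSE suppresses the printed report; the result is kept in res
  res <- glmtoolbox::zero.excess(fit, method = "naive", verbose = FALSE)

  # Columns taken by position, in the order given in the Value section
  observed <- res[1, 1]
  expected <- res[1, 2]
  z_value  <- res[1, 3]
  p_value  <- res[1, 4]
}
```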

References

He Hua, Zhang Hui, Ye Peng, Tang Wan (2019) A test of inflated zeros for Poisson regression models, Statistical Methods in Medical Research 28, 1157-1169.

Examples

####### Example 1: Self-diagnosed ear infections in swimmers
data(swimmers)
fit1 <- glm(infections ~ frequency + location, family=poisson, data=swimmers)
zero.excess(fit1,rep=50)
fit2 <- overglm(infections ~ frequency + location, family="nb1", data=swimmers)
zero.excess(fit2,rep=50)

####### Example 2: Article production by graduate students in biochemistry PhD programs
bioChemists <- pscl::bioChemists
fit1 <- glm(art ~ fem + kid5 + ment, family=poisson, data=bioChemists)
zero.excess(fit1,rep=50)
fit2 <- overglm(art ~ fem + kid5 + ment, family="nb1", data=bioChemists)
zero.excess(fit2,rep=50)

####### Example 3: Roots Produced by the Columnar Apple Cultivar Trajan
data(Trajan)
fit1 <- glm(roots ~ photoperiod, family=poisson, data=Trajan)
zero.excess(fit1,rep=50)
fit2 <- overglm(roots ~ photoperiod, family="nbf", data=Trajan)
zero.excess(fit2,rep=50)

[Package glmtoolbox version 0.1.12 Index]