zic.reg {bizicount}

R Documentation

Univariate zero-inflated Poisson and negative binomial regression models

Description

This function from the bizicount package estimates univariate zero-inflated Poisson and negative binomial regression models via maximum likelihood, using either the nlm or optim optimization functions. Its class has associated simulate methods for post-estimation diagnostics using the DHARMa package, as well as an extract method for printing professional tables using texreg. Visit the 'See Also' section for links to these methods for zicreg objects.
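To make "maximum likelihood via nlm" concrete, here is a minimal base-R sketch of a zero-inflated Poisson fit with log and logit links. It is not the package's internal code: the helper `zip_nll`, the simulated data, and the starting values are all assumptions made for illustration, and the standard errors use the usual inverse-Hessian approximation.

```r
set.seed(1)
n <- 2000
x <- cbind(1, rnorm(n))            # count design matrix (intercept first)
z <- cbind(1, rbeta(n, 4, 8))      # zero-inflation design matrix
b <- c(1, 0.5)                     # illustrative count coefficients
g <- c(-1, 1.7)                    # illustrative zero-inflation coefficients
lam <- exp(drop(x %*% b))
psi <- plogis(drop(z %*% g))

# With probability psi the outcome is a structural zero, otherwise Poisson(lam)
y <- ifelse(rbinom(n, 1, psi) == 1, 0, rpois(n, lam))

# Hypothetical helper: negative ZIP log-likelihood, count parameters first
zip_nll <- function(par, y, X, Z) {
  kx  <- ncol(X)
  lam <- exp(drop(X %*% par[seq_len(kx)]))
  psi <- plogis(drop(Z %*% par[-seq_len(kx)]))
  ll  <- ifelse(y == 0,
                log(psi + (1 - psi) * exp(-lam)),
                log1p(-psi) + dpois(y, lam, log = TRUE))
  -sum(ll)
}

start <- c(log(mean(y)), 0, 0, 0)
fit <- nlm(zip_nll, p = start, y = y, X = x, Z = z, hessian = TRUE)
est <- fit$estimate
se  <- sqrt(diag(solve(fit$hessian)))  # asymptotic SEs from the inverse Hessian
```

The parameter ordering in `est` (count coefficients, then zero-inflation coefficients) mirrors the ordering this page describes for `starts` and `coef`.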

Usage

zic.reg(
  fmla = NULL,
  data,
  dist = "pois",
  link.ct = "log",
  link.zi = "logit",
  optimizer = "nlm",
  starts = NULL,
  subset,
  na.action,
  weights = rep(1, length(y)),
  X = NULL,
  z = NULL,
  y = NULL,
  offset.ct = NULL,
  offset.zi = NULL,
  warn.parent = T,
  keep = F,
  ...
)

Arguments

`fmla`	A `formula` of the form `y ~ x_1 + x_2 + ... + x_n + offset(count_var) | z_1 + ... + z_n + offset(zi_var)`, where the `x` values are covariates in the count portion of the model, and the `z` values are covariates in the zero-inflation portion. The `z` and `x` variables can be the same. If `NULL`, design matrices, the response vector, and offsets can be entered directly; see `X`, `z`, `y`, `offset.ct`, and `offset.zi` below.
`data`	A `data.frame` containing all variables appearing in `fmla`, including offsets. If not specified, variables are searched for in the parent environment.
`dist`	The distribution used for the count portion of the zero-inflated mixture. One of `c("pois", "nbinom")`, partial matching supported.
`link.ct`	String specifying the link function used for the count portion of the mixture distribution. One of `c("log", "identity", "sqrt")`. See `family`.
`link.zi`	Character string specifying the link function used for the zero-inflation portion of the mixture distribution. One of `c("logit", "probit", "cauchit", "log", "cloglog")`. See `family`.
`optimizer`	String specifying the optimizer to be used for fitting, one of `c("nlm", "optim")`. If `"optim"`, defaults to `method="BFGS"`.
`starts`	Optional vector of starting values used for the numerical optimization procedure. Should have count parameters first (with intercept first, if applicable), followed by zero-inflated parameters (with intercept first, if applicable), and the inverse dispersion parameter last (if applicable).
`subset`	Vector indicating the subset of observations on which to estimate the model.
`na.action`	A function which indicates what should happen when the data contain NAs. Default is `na.omit`.
`weights`	An optional numeric vector of weights for each observation.
`X`, `z`	If `fmla = NULL`, these are the design matrices of covariates for the count and zero-inflation portions, respectively. Neither may contain missing values. Supplying them directly is similar in spirit to `glm.fit` and can be faster for larger datasets because it bypasses model matrix creation.
`y`	If `fmla = NULL`, a vector containing the response variable.
`offset.ct`, `offset.zi`	If `fmla = NULL`, vectors containing the (constant) offset for the count and zero-inflated portions, respectively. Each must be equal in length to `y` and to the number of rows of `X` and `z`. If left `NULL`, defaults to `rep(0, length(y))`.
`warn.parent`	Logical indicating whether to warn about `data` not being supplied.
`keep`	Logical indicating whether to keep the model matrices in the returned model object. Must be `TRUE` to use `DHARMa` and `texreg` with the model object, e.g., via `simulate.zicreg` and `extract.zicreg`, as well as base generics like `fitted` and `predict`.
`...`	Additional arguments to pass on to the chosen optimizer, either `nlm` or `optim`. See 'Examples'.
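
The link names accepted by `link.ct` and `link.zi` are the same names that base R's `make.link` understands, so their effect on a linear predictor can be inspected directly. The sketch below is illustrative only and assumes the package applies the inverse link in the usual GLM sense; the values in `eta` are arbitrary.

```r
eta <- c(-2, 0, 2)  # arbitrary linear predictor values

ct_links <- c("log", "identity", "sqrt")
zi_links <- c("logit", "probit", "cauchit", "log", "cloglog")

# Inverse links map the linear predictor to a count mean ...
ct_means <- sapply(ct_links, function(l) make.link(l)$linkinv(eta))
# ... or to a zero-inflation probability
zi_probs <- sapply(zi_links, function(l) make.link(l)$linkinv(eta))

print(round(ct_means, 3))
print(round(zi_probs, 3))
```

Note that the `identity` link can produce a negative mean and the `log` link for zero inflation can produce a value above 1, so those choices may need sensible starting values or constrained covariates.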

Value

An S3 zicreg-class object, which is a list containing:

call – The original function call
obj – The class of the object
coef – Vector of coefficients, with count, then zi, then dispersion.
se – Vector of asymptotic standard errors
grad – Gradient vector at convergence
link.ct – Name of link used for count portion
link.zi – Name of link used for zero-inflated portion
dist – Name of distribution used for count portion
optimizer – Name of optimization package used in fitting
coefmat.ct – Coefficient matrix for count portion
coefmat.zi – Coefficient matrix for zero-inflated portion
convergence – Convergence code from optimization routine.
coefmat.all – Coefficient matrix for both parts of the model
theta – Coefficient matrix for dispersion, if applicable.
covmat – Asymptotic covariance matrix
nobs – Number of observations
aic – Akaike information criterion
bic – Bayesian information criterion
loglik – Log-likelihood at convergence
model – List containing model matrices if keep = TRUE
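
As a rough guide to how several of these components relate to one another, the sketch below rebuilds the information criteria and a Wald-style coefficient matrix from a log-likelihood, estimates, and standard errors. It uses a plain Poisson `glm` as a stand-in so that it runs with base R alone; that the zicreg components follow these standard definitions, and the column layout shown, are assumptions rather than something this page states.

```r
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rpois(n, exp(0.5 + 0.3 * x))
fit <- glm(y ~ x, family = poisson)   # stand-in model, not a zicreg object

ll <- as.numeric(logLik(fit))         # analogue of loglik
k  <- length(coef(fit))               # number of estimated parameters

aic <- -2 * ll + 2 * k                # analogue of aic
bic <- -2 * ll + log(n) * k           # analogue of bic, with nobs = n

est  <- coef(fit)                     # analogue of coef
se   <- sqrt(diag(vcov(fit)))         # analogue of se, from the covariance matrix
zval <- est / se
coefmat <- cbind(Estimate     = est,
                 `Std. Error` = se,
                 `z value`    = zval,
                 `Pr(>|z|)`   = 2 * pnorm(-abs(zval)))
print(coefmat)
```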

Author(s)

John Niehaus

References

Lambert, Diane. "Zero-inflated Poisson regression, with an application to defects in manufacturing." Technometrics 34.1 (1992): 1-14.

Examples

## ZIP example
# Simulate some zip data
n=1000
x = cbind(1, rnorm(n))
z = cbind(1, rbeta(n, 4, 8))
b = c(1, 2.2)
g = c(-1, 1.7)
lam = exp(x %*% b)
psi = plogis(z %*% g)

y = bizicount::rzip(n, lambda = lam, psi=psi)
dat = cbind.data.frame(x = x[,-1], z = z[,-1], y = y)

# estimate zip model using NLM, no data.frame

mod = zic.reg(y ~ x[,-1] | z[,-1])

# same model, with dataframe

mod = zic.reg(y ~ x | z, data = dat)


# estimate zip using NLM, adjust stepmax via ... param

mod = zic.reg(y ~ x[,-1] | z[,-1], stepmax = .5)


# estimate zip using optim

mod = zic.reg(y ~ x[,-1] | z[,-1], optimizer = "optim")


# pass different method, reltol to optim using ... param

mod = zic.reg(y ~ x[,-1] | z[,-1],
        optimizer = "optim",
        method = "Nelder-Mead",
        control = list(reltol = 1e-10)
        )

# No formula: specify the design matrices and response directly (offsets default to zero).
zic.reg(y=y, X=x, z=z)



## ZINB example
# simulate zinb data

disp = .5
y = bizicount::rzinb(n, psi = psi, size = disp, mu=lam)


# zinb model, use keep = TRUE for post-estimation methods

mod = zic.reg(y ~ x[,-1] | z[,-1], dist = "n", keep = TRUE)

print(mod)
summary(mod)
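
The examples above leave the offsets at their defaults. The following sketch shows one way an exposure offset might be passed through the matrix interface, using only arguments listed under 'Arguments'. The `exposure` variable and the coefficient values are hypothetical, the data are simulated with base R rather than `rzip`, and the fit is skipped when bizicount is not installed.

```r
set.seed(7)
n <- 500
X <- cbind(1, rnorm(n))               # count design matrix
Z <- cbind(1, runif(n))               # zero-inflation design matrix
exposure <- runif(n, 0.5, 2)          # hypothetical exposure variable

lam <- exposure * exp(drop(X %*% c(0.5, 0.8)))
psi <- plogis(drop(Z %*% c(-1, 1)))
y <- ifelse(rbinom(n, 1, psi) == 1, 0, rpois(n, lam))

if (requireNamespace("bizicount", quietly = TRUE)) {
  # log(exposure) enters the count linear predictor with a coefficient fixed at 1
  mod <- bizicount::zic.reg(y = y, X = X, z = Z, offset.ct = log(exposure))
  print(mod$coef)
}
```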

[Package bizicount version 1.3.3 Index]