bizicount {bizicount}  R Documentation 
The main bivariate regression function of the bizicountpackage
Estimates copulabased bivariate zeroinflated (and noninflated)
count models via maximum likelihood. Supports the Frank and Gaussian
copulas, as well as zeroinflated Poisson and negative binomial margins
(and their noninflated counterparts). It's class has associated
simulate
methods for postestimation diagnostics using
the DHARMa
package, as well as an
extract
method for printing professional tables using
texreg
, and a test for zeromodification using zi_test
.
See the 'See Also' section for links to these methods.
bizicount(
fmla1,
fmla2,
data,
cop = "gaus",
margins = c("pois", "pois"),
link.ct = c("log", "log"),
link.zi = c("logit", "logit"),
scaling = "none",
starts = NULL,
keep = TRUE,
subset,
na.action,
weights,
frech.min = 1e07,
pmf.min = 1e07,
...
)
fmla1, fmla2 

data 
A 
cop 
Character string specifying the copula to be used. One of

margins 
Length 2 character vector specifying the marginal
distributions for each outcome. Each of the two elements must be one of

link.ct 
Length 2 character string specifying the link function used
for the count portion of each margin. One of 
link.zi 
Length 2 character string specifying the link function used
for the zeroinflation portion of each margin. One of 
scaling 
Character string (partial matching supported) indicating what
type of scaling to apply to covariates (if any). One of
Factor variables, offsets, and the intercept are not scaled. The names of variables that have been
scaled are returned as part of the 
starts 
Numeric vector of starting values for parameter estimates. See
'Details' section regarding the correct order for the values in this vector.
If 
keep 
Logical indicating whether to keep the model matrix in the
returned model object. Defaults to 
subset 
A vector indicating the subset of observations to use in estimation. 
na.action 
A function which indicates what should happen when the data
contain NAs. Default is 
weights 
An optional numeric vector of weights for each observation. 
frech.min 
Lower boundary for FrechetHoeffding bounds on copula CDF.
Used for computational purposes to prevent over/underflow in likelihood
search. Must be in 
pmf.min 
Lower boundary on copula PMF evaluations. Used for
computational purposes to prevent over/underflow in likelihood search. Must
be in 
... 
Additional arguments to be passed on to the quasinewton fitting
function, 
starts
– Starting values should be organized as
follows:
count parameters for margin 1
count parameters for margin 2
zeroinflated parameters for margin 1 (if applicable),
zeroinflated parameters for margin 2 (if applicable),
inverse dispersion parameter for margin 1 (if applicable),
inverse dispersion parameter for margin 2 (if applicable)
Thus, in general count parameters should come first, followed by zeroinflation parameters, and finally inverse dispersion parameters.
frech.min
– Changing this argument should almost never be
necessary. Frechet (1951) and Hoeffding (1940) showed that copula CDFs have
bounds of the form max\{u + v  1, 0\} \le C(u, v) \le min\{u, v\}
, where
u
and v
are uniform realizations derived from the probability
integral transform. Due to numerical underflow, very small values of u
and v
can be rounded to zero. Particularly when evaluating the Gaussian
copula CDF this is problematic, ultimately leading to infinitevalued
likelihood evaluations. Therefore, we impose FrechetHoeffding bounds
numerically as max\{u + v  1, frech.min\} \le C(u, v) \le min\{u, v, 1 
frech.min\}
. NOTE: Setting this to 0 imposes the original Frechet bounds
mentioned above.
pmf.min
– Changing this argument should almost never be
necessary. Observations can have likelihoods that are extremely close to 0.
Numerically, these get rounded to 0 due to underflow. Then, taking logarithms
results in an infinite likelihood. To avoid this, we bound PMF evaluations
from below at pmf.min
.
...
– Sometimes it may be useful to alter nlm
's
default parameters. This can be done by simply passing those arguments into
bizicount()
. The two that seem to benefit the fitting process the most are
stepmax
and steptol
. Readers are referred to the documentation on
nlm
for more details on these parameters. It can be
useful to lower stepmax
particularly when the Hessian is not negative
definite at convergence, sometimes to a value between 0 and 1. It can also be
beneficial to increase steptol
.
An S3 bizicountclass
object, which is a list containing:
coef
– Coefficients of the model
coef.nid
– Coefficients without margin IDs
coef.orig
– Coefficients prior to transformations, for Gaussian
dependence and negative binomial dispersion.
coef.orig.nid
– Coefficients prior to transforms, no margin IDs.
se
– Asymptotic normaltheory standard errors based on observed Fisher Information
se.nid
– Standard errors without margin IDs
z
– zscores for parameter estimates
z.nid
– zscores without margin IDs
p
– pvalues for parameter estimates
p.nid
– pvalues without margin IDs
coefmats
– A list containing coeficient matrices for each margin
loglik
– Scalar loglikelihood at convergence
grad
– Numerical gradient vector at convergence
n.iter
– Number of quasinewton fitting iterations.
covmat
– Covariance matrix of parameter estimates based on observed Fisher Information
aic
– Model's Akaike information
bic
– Model's Bayesian information criterion
nobs
– Number of observations
margins
– Marginal distributions used in fitting
link.zi, link.ct
– Names of link functions used in fitting
invlink.ct, invlink.zi
– Inverse link functions used in fitting (the
actual function, not their names)
outcomes
– Name of the response vector
conv
– Integer telling convergence status in nlm. See ?nlm.
cop
– The copula used in fitting
starts
– list of starting values used
call
– The model's call
model
– List containing model matrices, or NULL
if keep = F
.
scaled
– Vector indicating which covariates in each margin were scaled
according to the scaling
parameter.
John Niehaus
Genest C, Nešlehová J (2007). “A primer on copulas for count data.” ASTIN Bulletin: The Journal of the IAA, 37(2), 475–515.
Inouye DI, Yang E, Allen GI, Ravikumar P (2017). “A review of multivariate distributions for count data derived from the Poisson distribution.” Wiley Interdisciplinary Reviews: Computational Statistics, 9(3).
Joe H (1997). Multivariate models and multivariate dependence concepts. CRC Press.
Nikoloulopoulos A (2013). “CopulaBased Models for Multivariate Discrete Response Data.” In P Jaworski, F Durante, WK Härdle (eds.), Copulae in Mathematical and Quantitative Finance, chapter 11, pp. 231–250. Springer.
Nelsen RB (2007). An Introduction to Copulas. Springer Science & Business Media.
Trivedi P, Zimmer D (2017). “A note on identification of bivariate copulas for discrete countdata.” Econometrics, 5(1), 10.
Trivedi PK, Zimmer DM (2007). Copula modeling: an introduction for practitioners. NowPublishers Inc.
extract.bizicount
, make_DHARMa
, zi_test
### bizicount example
## SETUP
set.seed(123)
n = 300
# define a function to simulate from a gaussian copula
# first margin is zeroinflated negative binomial (zinb)
# second margin is zeroinflated poisson (zip)
# Note: marginal distributions are hardcoded in function, including
# inverse dispersion parameter for zinb.
gen = function(n,
b1,
b2,
g1,
g2,
dep) {
k1 = length(b1)
k2 = length(b2)
X1 = cbind(1, matrix(rbinom(n * (k1  1), 1, .5), ncol = k1  1))
X2 = cbind(1, matrix(rexp(n * (k2  1), 3), ncol = k2  1))
lam1 = exp(X1 %*% b1)
lam2 = exp(X2 %*% b2)
Z1 = cbind(1, matrix(runif(n * (k1  1), 1, 1), ncol = k1  1))
Z2 = cbind(1, matrix(rnorm(n * (k2  1)), ncol = k2  1))
psi1 = plogis(Z1 %*% g1)
psi2 = plogis(Z2 %*% g2)
norm_vars = MASS::mvrnorm(
n,
mu = c(0, 0),
Sigma = matrix(c(1, dep, dep, 1), ncol =2)
)
U = pnorm(norm_vars)
y1 = qzinb(U[, 1],
mu = lam1,
psi = psi1,
size = .3)
y2 = qzip(U[, 2],
lambda = lam2,
psi = psi2)
dat = data.frame(
X1 = X1[, 1],
X2 = X2[, 1],
Z1 = Z1[, 1],
Z2 = Z2[, 1],
y1,
y2,
lam1,
lam2,
psi1,
psi2
)
return(dat)
}
# define parameters
b1 = c(1, 2, 3)
b2 = c(1, 3, 1)
g1 = c(2, 1.5, 2)
g2 = c(1, 3.75, 1.25)
rho = .5
# generate data
dat = gen(n, b1, b2, g1, g2, rho)
f1 = y1 ~ X1.1 + X1.2  Z1.1 + Z1.2
f2 = y2 ~ X2.1 + X2.2  Z2.1 + Z2.2
## END SETUP
# estimate model
mod = bizicount(f1, f2, dat, cop = "g", margins = c("zip", "zip"), keep = TRUE)
print(mod)
summary(mod)