R: Positive Bernoulli Family Function with Time Effects

posbernoulli.t {VGAM}

R Documentation

Positive Bernoulli Family Function with Time Effects

Description

Fits a GLM/GAM-like model to multiple Bernoulli responses where each row in the capture history matrix response has at least one success (capture). Sampling occasion effects are accommodated.

Usage

posbernoulli.t(link = "logitlink", parallel.t = FALSE ~ 1,
    iprob = NULL, p.small = 1e-4, no.warning = FALSE,
    type.fitted = c("probs", "onempall0"))

Arguments

`link`, `iprob`, `parallel.t`	See `CommonVGAMffArguments` for information. By default, the parallelism assumption does not apply to the intercept. Setting `parallel.t = FALSE ~ -1`, or equivalently `parallel.t = FALSE ~ 0`, results in the `M_0`/`M_h` model.
`p.small`, `no.warning`	A small probability value used to give a warning for the Horvitz–Thompson estimator. Any estimated probability value less than `p.small` will result in a warning, however, setting `no.warning = TRUE` will suppress this warning if it occurs. This is because the Horvitz-Thompson estimator is the sum of the reciprocal of such probabilities, therefore any probability that is too close to 0 will result in an unstable estimate.
`type.fitted`	See `CommonVGAMffArguments` for information. The default is to return a matrix of probabilities. If `"onempall0"` is chosen then the the probability that each animal is captured at least once in the course of the study is returned. The abbreviation stands for one minus the probability of all 0s, and the quantity appears in the denominator of the usual formula.

Details

These models (commonly known as M_t or M_{th} (no prefix h means it is an intercept-only model) in the capture–recapture literature) operate on a capture history matrix response of 0s and 1s (n \times \tau). Each column is a sampling occasion where animals are potentially captured (e.g., a field trip), and each row is an individual animal. Capture is a 1, else a 0. No removal of animals from the population is made (closed population), e.g., no immigration or emigration. Each row of the response matrix has at least one capture. Once an animal is captured for the first time, it is marked/tagged so that its future capture history can be recorded. Then it is released immediately back into the population to remix. It is released immediately after each recapture too. It is assumed that the animals are independent and that, for a given animal, each sampling occasion is independent. And animals do not lose their marks/tags, and all marks/tags are correctly recorded.

The number of linear/additive predictors is equal to the number of sampling occasions, i.e., M = \tau, say. The default link functions are (logit \,p_{1},\ldots,logit \,p_{\tau})^T where each p_{j} denotes the probability of capture at time point j. The fitted value returned is a matrix of probabilities of the same dimension as the response matrix.

A conditional likelihood is maximized here using Fisher scoring. Each sampling occasion has a separate probability that is modelled here. The probabilities can be constrained to be equal by setting parallel.t = FALSE ~ 0; then the results are effectively the same as posbinomial except the binomial constants are not included in the log-likelihood. If parallel.t = TRUE ~ 0 then each column should have at least one 1 and at least one 0.

It is well-known that some species of animals are affected by capture, e.g., trap-shy or trap-happy. This VGAM family function does not allow any behavioral effect to be modelled (posbernoulli.b and posbernoulli.tb do) because the denominator of the likelihood function must be free of behavioral effects.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

Upon fitting the extra slot has a (list) component called N.hat which is a point estimate of the population size N (it is the Horvitz-Thompson (1952) estimator). And there is a component called SE.N.hat containing its standard error.

Note

The weights argument of vglm need not be assigned, and the default is just a matrix of ones.

Fewer numerical problems are likely to occur for parallel.t = TRUE. Data-wise, each sampling occasion may need at least one success (capture) and one failure. Less stringent conditions in the data are needed when parallel.t = TRUE. Ditto when parallelism is applied to the intercept too.

The response matrix is returned unchanged; i.e., not converted into proportions like posbinomial. If the response matrix has column names then these are used in the labelling, else prob1, prob2, etc. are used.

Using AIC() or BIC() to compare posbernoulli.t, posbernoulli.b, posbernoulli.tb models with a posbinomial model requires posbinomial(omit.constant = TRUE) because one needs to remove the normalizing constant from the log-likelihood function. See posbinomial for an example.

Author(s)

Thomas W. Yee.

References

Huggins, R. M. (1991). Some practical aspects of a conditional likelihood approach to capture experiments. Biometrics, 47, 725–732.

Huggins, R. M. and Hwang, W.-H. (2011). A review of the use of conditional likelihood in capture–recapture experiments. International Statistical Review, 79, 385–400.

Otis, D. L. and Burnham, K. P. and White, G. C. and Anderson, D. R. (1978). Statistical inference from capture data on closed animal populations, Wildlife Monographs, 62, 3–135.

Yee, T. W. and Stoklosa, J. and Huggins, R. M. (2015). The VGAM package for capture–recapture data using the conditional likelihood. Journal of Statistical Software, 65, 1–33. doi:10.18637/jss.v065.i05.

Examples

M.t <- vglm(cbind(y1, y2, y3, y4, y5, y6) ~ 1, posbernoulli.t,
            data = deermice, trace = TRUE)
coef(M.t, matrix = TRUE)
constraints(M.t, matrix = TRUE)
summary(M.t, presid = FALSE)

M.h.1 <- vglm(Select(deermice, "y") ~ sex + weight, trace = TRUE,
              posbernoulli.t(parallel.t = FALSE ~ -1), deermice)
coef(M.h.1, matrix = TRUE)
constraints(M.h.1)
summary(M.h.1, presid = FALSE)
head(depvar(M.h.1))  # Response capture history matrix
dim(depvar(M.h.1))

M.th.2 <- vglm(cbind(y1, y2, y3, y4, y5, y6) ~ sex + weight,
               posbernoulli.t(parallel.t = FALSE), deermice)
# Test the parallelism assumption wrt sex and weight:
lrtest(M.h.1, M.th.2)
coef(M.th.2)
coef(M.th.2, matrix = TRUE)
constraints(M.th.2)
summary(M.th.2, presid = FALSE)
head(model.matrix(M.th.2, type = "vlm"), 21)

M.th.2@extra$N.hat  # Population size estimate; should be about N
M.th.2@extra$SE.N.hat  # SE of the estimate of the population size
# An approximate 95 percent confidence interval:
round(M.th.2@extra$N.hat + c(-1, 1)*1.96* M.th.2@extra$SE.N.hat, 1)

# Fit a M_h model, effectively the parallel M_t model:
deermice <- transform(deermice, ysum = y1 + y2 + y3 + y4 + y5 + y6,
                                tau  = 6)
M.h.3 <- vglm(cbind(ysum, tau - ysum) ~ sex + weight,
              posbinomial(omit.constant = TRUE), data = deermice)
max(abs(coef(M.h.1) - coef(M.h.3)))  # Should be zero
# Difference is due to the binomial constants:
logLik(M.h.3) - logLik(M.h.1)

[Package VGAM version 1.1-11 Index]