R: Estimate of a Probability from Clustered Binomial Data

varbin {aods3}

R Documentation

Estimate of a Probability from Clustered Binomial Data

Description

The function estimates a probability and its variance from clustered binomial data

{(n_1, m_1), (n_2, m_2), ..., (n_N, m_N)},

where n_i is the size of cluster i, m_i the number of “successes” (proportions are y = m/n), and N the number of clusters. Confidence intervals are calculated using a normal approximation, which might be inappropriate when the probability is close to 0 or 1.
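As a rough illustration of the normal approximation mentioned above, the sketch below builds a two-sided interval from an estimate and its variance. The helper name normal_ci and the input values are hypothetical and are not part of aods3; the interval actually returned by varbin may be computed differently.

```r
# Hypothetical helper (not part of aods3): normal-approximation interval
# for an estimate mu with estimated variance varmu.
normal_ci <- function(mu, varmu, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)              # two-sided standard normal quantile
  c(lower = mu - z * sqrt(varmu),
    upper = mu + z * sqrt(varmu))
}

# Made-up values: one estimate far from the boundaries, one close to 0.
ci_mid  <- normal_ci(mu = 0.30, varmu = 0.004)
ci_edge <- normal_ci(mu = 0.02, varmu = 0.0004)

ci_mid
ci_edge   # the lower bound is negative, which is not a valid probability
```

The second call shows why the approximation can be inappropriate near 0 or 1: nothing constrains the interval to stay within [0, 1].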

Usage

varbin(n, m, alpha = 0.05, R = 5000)

## S3 method for class 'varbin'
print(x, ...)

Arguments

`n`	A vector of the sizes of the clusters.
`m`	A vector of the numbers of successes (proportions are y = m / n).
`alpha`	The significance level for the confidence intervals. Defaults to 0.05, which gives 95% confidence intervals.
`R`	The number of bootstrap replicates used to compute the bootstrap mean and variance. Defaults to 5000.
`x`	An object of class “varbin”.
`...`	Further arguments to be passed to “print”.

Details

Five methods are used for the estimation. Consider N clusters of sizes n_1, ..., n_N with observed counts of successes m_1, ..., m_N, and denote by y_i = m_i / n_i (i = 1, ..., N) the observed proportions. The underlying assumption is that the probability, say \mu, is homogeneous across the clusters.

Binomial method: the probability estimate and its variance are calculated by

\mu = (sum_{i} (m_i)) / (sum_{i} (n_i)) (ratio estimate) and

\mu * (1 - \mu) / (sum_{i} (n_i) - 1), respectively.
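The two binomial formulas above can be checked by hand. This is a minimal sketch on made-up data; the vectors n and m are hypothetical and the variable names are chosen for illustration only.

```r
# Hypothetical clustered data: 5 clusters.
n <- c(10, 12, 8, 15, 9)   # cluster sizes
m <- c(2, 5, 1, 6, 3)      # numbers of successes

# Ratio estimate of the probability: pooled successes over pooled sizes.
mu_bin <- sum(m) / sum(n)

# Binomial variance of the estimate, with denominator sum(n) - 1.
varmu_bin <- mu_bin * (1 - mu_bin) / (sum(n) - 1)

c(mu = mu_bin, varmu = varmu_bin)
```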

Ratio method: the probability \mu is estimated as in the binomial method (ratio estimate). Its variance is calculated with the one-stage cluster sampling formula (see Cochran, 1999, p. 32 and p. 66).
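This page does not spell out the cluster sampling formula. The sketch below uses one common textbook form for the variance of a ratio estimate, without a finite-population correction; that this is exactly what varbin computes is an assumption. The data are made up.

```r
# Hypothetical clustered data: 5 clusters.
n <- c(10, 12, 8, 15, 9)   # cluster sizes
m <- c(2, 5, 1, 6, 3)      # numbers of successes
N <- length(n)

# Ratio estimate, identical to the binomial method.
mu_ratio <- sum(m) / sum(n)

# Assumed form of the cluster-sampling variance (no finite-population
# correction): it is driven by the cluster-level residuals m_i - n_i * mu.
resid       <- m - n * mu_ratio
varmu_ratio <- N / (N - 1) * sum(resid^2) / sum(n)^2

c(mu = mu_ratio, varmu = varmu_ratio)
```

Unlike the binomial variance, this quantity grows when the cluster proportions differ from each other more than binomial sampling alone would suggest.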

Arithmetic method: the probability is estimated by \mu = sum_{i} (y_i) / N. The variance of \mu is estimated by sum_{i} (y_i - \mu)^2 / (N * (N - 1)).
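The arithmetic method gives every cluster the same weight, whatever its size. A minimal sketch on made-up data (n and m are hypothetical):

```r
# Hypothetical clustered data: 5 clusters.
n <- c(10, 12, 8, 15, 9)   # cluster sizes
m <- c(2, 5, 1, 6, 3)      # numbers of successes
N <- length(n)
y <- m / n                 # observed proportions

# Unweighted mean of the cluster proportions.
mu_arithm <- sum(y) / N

# Variance of that mean: sum of squared deviations over N * (N - 1).
varmu_arithm <- sum((y - mu_arithm)^2) / (N * (N - 1))

c(mu = mu_arithm, varmu = varmu_arithm)
```

The variance expression is the same as var(y) / N, the usual squared standard error of a sample mean.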

Jackknife method: the probability estimate \mu is the arithmetic mean of the jackknife pseudovalues y_{v,i}. The variance is estimated by sum_{i} (y_{v,i} - \mu)^2 / (N * (N - 1)) (Gladen, 1977; Paul, 1982).
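The pseudovalues are not defined on this page. The sketch below assumes the usual leave-one-out construction applied to the pooled ratio estimate; both that choice of base estimator and the made-up data are assumptions, so the numbers need not match varbin output.

```r
# Hypothetical clustered data: 5 clusters.
n <- c(10, 12, 8, 15, 9)   # cluster sizes
m <- c(2, 5, 1, 6, 3)      # numbers of successes
N <- length(n)

# Assumed base estimator: pooled ratio estimate on all clusters.
theta <- sum(m) / sum(n)

# Leave-one-out estimates: drop cluster i, recompute the ratio.
theta_loo <- (sum(m) - m) / (sum(n) - n)

# Standard jackknife pseudovalues.
pseudo <- N * theta - (N - 1) * theta_loo

# Jackknife estimate and its variance, following the formulas above.
mu_jack    <- mean(pseudo)
varmu_jack <- sum((pseudo - mu_jack)^2) / (N * (N - 1))

c(mu = mu_jack, varmu = varmu_jack)
```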

Bootstrap method: R samples of N clusters each are drawn with replacement and with equal probability from the initial sample (y_1, ..., y_N) (Efron and Tibshirani, 1993). The bootstrap estimate \mu and its estimated variance are, respectively, the arithmetic mean and the empirical variance (computed with denominator R - 1) of the R binomial ratio estimates.
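The resampling scheme can be sketched as follows. The data, the seed and the reduced number of replicates are chosen for illustration only, and the internal implementation in varbin may differ in detail.

```r
# Hypothetical clustered data: 5 clusters.
n <- c(10, 12, 8, 15, 9)   # cluster sizes
m <- c(2, 5, 1, 6, 3)      # numbers of successes
N <- length(n)

set.seed(1)                # arbitrary seed, for reproducibility only
R <- 1000                  # fewer replicates than the default, to run quickly

# Each replicate resamples whole clusters with replacement and
# recomputes the binomial ratio estimate on the resampled clusters.
boot_est <- replicate(R, {
  idx <- sample.int(N, size = N, replace = TRUE)
  sum(m[idx]) / sum(n[idx])
})

mu_boot    <- mean(boot_est)
varmu_boot <- var(boot_est)   # var() uses the denominator R - 1

c(mu = mu_boot, varmu = varmu_boot)
```

Resampling clusters rather than individual observations keeps the within-cluster dependence intact in every replicate.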

Value

An object of class varbin, printed with print.varbin.

References

Cochran, W.G., 1999, 3rd ed. Sampling techniques. Wiley, New York.
Efron, B., Tibshirani, R., 1993. An introduction to the bootstrap. Chapman and Hall, London.
Gladen, B., 1977. The use of the jackknife to estimate proportions from toxicological data in the presence of litter effects. JASA 74(366), 278-283.
Paul, S.R., 1982. Analysis of proportions of affected foetuses in teratological experiments. Biometrics 38, 361-370.

Examples

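library(aods3)
data(rabbits)

# Estimates for a single group
z <- rabbits[rabbits$group == "M", ]
varbin(z$n, z$m)

# Estimates for each group, with fewer bootstrap replicates
by(rabbits,
   list(group = rabbits$group),
   function(x) varbin(n = x$n, m = x$m, R = 1000))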

[Package aods3 version 0.4-1.2 Index]