ci.p {asbio} R Documentation

## Confidence interval estimation for the binomial parameter pi using five popular methods.

### Description

Confidence interval formulae for μ are not appropriate for variables describing binary outcomes. The function ci.p calculates confidence intervals for the binomial parameter π (probability of success) using raw or summarized data. By default, Agresti-Coull point estimators are used to estimate π and σ_{\hat{π}}. If raw data are used (the default), successes should be coded as ones and failures as zeros in the data vector. Finite population corrections can also be specified.

### Usage


ci.p(data, conf = 0.95, summarized = FALSE, phat = NULL,
fpc = FALSE, n = NULL, N = NULL, method="agresti.coull", plot = TRUE)


### Arguments

| Argument | Description |
| --- | --- |
| data | A vector of binary data. Required if summarized = FALSE. |
| conf | Level of confidence, 1 - P(type I error). |
| summarized | Logical; indicates whether raw data or summary statistics are to be used. |
| phat | Estimate of π. Required if summarized = TRUE. |
| fpc | Logical; indicates whether finite population corrections should be used. If fpc = TRUE then N must be specified. Finite population corrections are not possible for method = "exact" or method = "score". |
| n | Sample size. Required if summarized = TRUE. |
| N | Population size. Required if fpc = TRUE. |
| method | Type of method to be used in confidence interval calculations; method = "agresti.coull" is the default. Other procedures include method = "asymptotic", which provides the conventional normal (Wald) approximation, method = "score", method = "LR", and method = "exact" (see Details below). Partial names can be used. The "exact" method cannot be implemented if summarized = TRUE. |
| plot | Logical; should a likelihood ratio plot be created with the estimate from method = "LR"? |

### Details

For the binomial distribution, the parameter of interest is the probability of success, π. The ML estimator for π, and the estimator for its standard error, σ_{\hat{π}}, are:

\hat{π}=\frac{x}{n},

\hat{σ}_{\hat{π}}=\sqrt{\frac{\hat{π}(1-\hat{π})}{n}},

where x is the number of successes and n is the number of observations.

Because the sampling distribution of an ML estimator is asymptotically normal (under standard regularity conditions), an "asymptotic" 100(1 - α)% confidence interval for π is found using:

\hat{π}\pm z_{1-(α/2)}\hat{σ}_{\hat{π}}.

This method has also been called the Wald confidence interval.
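
The asymptotic interval can be computed by hand in a few lines. The following is a minimal base R sketch, not the internal code of ci.p; the counts are taken from the Examples section, and the variable names (pi_hat, se_hat, margin, wald_ci) are chosen here for illustration only.

```r
# Wald ("asymptotic") interval from summarized counts, base R only.
x <- 26300      # number of successes
n <- 56200      # number of observations
conf <- 0.95    # confidence level, 1 - alpha

pi_hat <- x / n                               # ML estimate of pi
se_hat <- sqrt(pi_hat * (1 - pi_hat) / n)     # estimated standard error of pi_hat
z <- qnorm(1 - (1 - conf) / 2)                # z_{1 - alpha/2}
margin <- z * se_hat                          # confidence margin

wald_ci <- c(lower = pi_hat - margin, upper = pi_hat + margin)
wald_ci
```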

These estimators can produce extremely inaccurate confidence intervals, particularly for small sample sizes or when π is near 0 or 1 (Agresti 2012). A better method is to invert the binomial test statistic, varying the value of π_0 in both the numerator and the standard error of the statistic. The interval consists of the values of π_0 for which the test fails to reject the null hypothesis at level α. Bounds can be obtained by finding the roots of a quadratic expansion of the binomial likelihood function (see Agresti 2012). This has been called a "score" confidence interval (Agresti 2012). A simple approximation to this method can be obtained by adding z^2_{1-(α/2)}/2 (\approx 2 for α = 0.05) to both the number of successes and the number of failures (Agresti and Coull 1998). The resulting Agresti-Coull estimators for π and σ_{\hat{π}} are:

\hat{π}=\frac{x+z^2/2}{n+z^2},

\hat{σ}_{\hat{π}}=\sqrt{\frac{\hat{π}(1-\hat{π})}{n+z^2}},

where z is the standard normal inverse cdf at probability 1 - α/2.

As above, the 100(1 - α)% confidence interval for the binomial parameter π is found using:

\hat{π}\pm z_{1-(α/2)}\hat{σ}_{\hat{π}}.
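
To make the relationship between the Agresti-Coull and score intervals concrete, here is a hedged base R sketch for a small hypothetical sample (x = 7 successes in n = 20 trials). It does not reproduce the internals of ci.p. The score bounds use the closed-form roots usually attributed to Wilson (1927), and all variable names are illustrative.

```r
# Agresti-Coull and score intervals for a small hypothetical sample, base R only.
x <- 7
n <- 20
conf <- 0.95
z <- qnorm(1 - (1 - conf) / 2)

# Agresti-Coull: add z^2/2 successes and z^2/2 failures, then apply the Wald form.
n_tilde  <- n + z^2
pi_tilde <- (x + z^2 / 2) / n_tilde
se_tilde <- sqrt(pi_tilde * (1 - pi_tilde) / n_tilde)
ac_ci <- c(lower = pi_tilde - z * se_tilde, upper = pi_tilde + z * se_tilde)

# Score interval: roots of the quadratic obtained by setting
# (pi_hat - pi_0)^2 = z^2 * pi_0 * (1 - pi_0) / n and solving for pi_0.
pi_hat <- x / n
half_width <- z * sqrt(n * pi_hat * (1 - pi_hat) + z^2 / 4) / (n + z^2)
score_ci <- c(lower = pi_tilde - half_width, upper = pi_tilde + half_width)

rbind(agresti.coull = ac_ci, score = score_ci)
```

Both intervals share the midpoint (x + z^2/2)/(n + z^2); in this sketch the Agresti-Coull interval is slightly wider than the score interval.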

The likelihood ratio method, method = "LR", finds the points on the binomial log-likelihood function where the difference between the maximized log-likelihood and the log-likelihood function is closest to χ_1^{2}(1 - α)/2, over a grid of candidate values for π_0. The grid used by the function is seq(0.00001, 0.99999, by = 0.00001).
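
The grid search described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the code used by ci.p: the sample (x = 7, n = 20) is hypothetical, the sketch assumes 0 < x < n so that the log-likelihood is finite at the ML estimate, and the helper loglik and the other names are introduced here.

```r
# Likelihood ratio interval by grid search, base R only (assumes 0 < x < n).
x <- 7
n <- 20
conf <- 0.95

# Binomial log-likelihood, dropping the constant binomial coefficient.
loglik <- function(p) x * log(p) + (n - x) * log(1 - p)

pi_hat  <- x / n
pi_grid <- seq(0.00001, 0.99999, by = 0.00001)   # candidate values of pi_0

# Drop in log-likelihood from its maximum at each candidate value.
ll_drop <- loglik(pi_hat) - loglik(pi_grid)
target  <- qchisq(conf, df = 1) / 2

# Search each side of the ML estimate for the point closest to the target.
below <- pi_grid < pi_hat
lower <- pi_grid[below][which.min(abs(ll_drop[below] - target))]
upper <- pi_grid[!below][which.min(abs(ll_drop[!below] - target))]

lr_ci <- c(lower = lower, upper = upper)
lr_ci
```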

The "exact" method of Clopper and Pearson (1934) guarantees coverage of at least the nominal level, but actual coverage may be well above this level, particularly when n is small and π is near 0 or 1.
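
For reference, Clopper-Pearson bounds can be written as beta quantiles, and base R's binom.test reports the same interval. The sketch below uses a hypothetical sample (x = 7, n = 20) with 0 < x < n; it illustrates the method and is not taken from the ci.p source.

```r
# Clopper-Pearson ("exact") interval via beta quantiles, base R only.
x <- 7
n <- 20
conf <- 0.95
alpha <- 1 - conf

# Valid as written for 0 < x < n; the bounds are 0 when x = 0 and 1 when x = n.
cp_ci <- c(lower = qbeta(alpha / 2, x, n - x + 1),
           upper = qbeta(1 - alpha / 2, x + 1, n - x))

# binom.test() in the stats package returns the same Clopper-Pearson interval.
bt_ci <- as.numeric(binom.test(x, n, conf.level = conf)$conf.int)

rbind(qbeta = cp_ci, binom.test = bt_ci)
```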

Agresti (2012) recommends the Agresti-Coull method over the normal approximation, the score method over the Agresti-Coull method, and the likelihood ratio method over all others. The Clopper-Pearson method has been repeatedly criticized as being too conservative (Agresti and Coull 1998).
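
The coverage behavior discussed above can be checked exactly for a given n and π by summing binomial probabilities over every outcome whose interval contains π. The following base R sketch does this for the Wald and Clopper-Pearson intervals; the setting (n = 20, π = 0.1) and the helper names (coverage, wald_fun, cp_fun) are assumptions made for illustration.

```r
# Exact coverage of an interval procedure at a fixed n and true pi, base R only.
coverage <- function(ci_fun, n, pi0, conf = 0.95) {
  x <- 0:n
  # One column of (lower, upper) bounds per possible number of successes.
  bounds <- vapply(x, function(k) ci_fun(k, n, conf), numeric(2))
  covered <- bounds[1, ] <= pi0 & pi0 <= bounds[2, ]
  sum(dbinom(x, n, pi0)[covered])
}

# Wald interval.
wald_fun <- function(x, n, conf) {
  p <- x / n
  m <- qnorm(1 - (1 - conf) / 2) * sqrt(p * (1 - p) / n)
  c(p - m, p + m)
}

# Clopper-Pearson interval, as returned by binom.test().
cp_fun <- function(x, n, conf) {
  as.numeric(binom.test(x, n, conf.level = conf)$conf.int)
}

cov_wald <- coverage(wald_fun, n = 20, pi0 = 0.1)
cov_cp   <- coverage(cp_fun,   n = 20, pi0 = 0.1)
c(wald = cov_wald, clopper.pearson = cov_cp)
```

In this setting the Wald coverage falls below the nominal 0.95, while the Clopper-Pearson coverage is at or above it.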

### Value

Returns a list of class = "ci".

| Component | Description |
| --- | --- |
| pi.hat | Estimate for π. |
| S.p.hat | Estimate for σ_{\hat{π}}. |
| margin | Confidence margin. |
| ci | Confidence interval. |

### Note

This function contains only a few of the many methods that have been proposed for confidence interval estimation for π.

### Author(s)

Ken Aho. Thanks to Simon Thelwall for finding an error with summarized data under fpc.

### References

Agresti, A. (2012) Categorical Data Analysis, 3rd edition. New York. Wiley.

Agresti, A., and Coull, B. A. (1998) Approximate is better than 'exact' for interval estimation of binomial proportions. The American Statistician 52: 119-126.

Clopper, C. J., and Pearson, E. S. (1934) The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 26: 404-413.

Ott, R. L., and Longnecker, M. T. (2004) A First Course in Statistical Methods. Thomson.

Wilson, E. B. (1927) Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 22: 209-212.

### See Also

ci.mu.z

### Examples

# In 2001, it was estimated that 56,200 Americans would be diagnosed with
# non-Hodgkin's lymphoma and that 26,300 would die from it (Cernan et al. 2002).
# Here we find the 95% confidence interval for pi, the probability of death
# among those diagnosed (deaths are coded as ones).

ci.p(c(rep(0, 56200-26300),rep(1,26300))) # Agresti-Coull
ci.p(c(rep(0, 56200-26300),rep(1,26300)), method = "LR") # Likelihood Ratio

# summarized = TRUE
n <- 56200
x <- 26300
phat <- x/n

ci.p(summarized = TRUE, phat = phat, n = n) # Agresti-Coull

# Use 2001 US population size as N
N <- 285 * 10^6
ci.p(c(rep(0, 56200-26300),rep(1,26300)), fpc = TRUE, N = N) # Agresti-Coull
ci.p(summarized = TRUE, phat = phat, n = n, N = N, fpc = TRUE) # Agresti-Coull


[Package asbio version 1.7 Index]