ci.p {asbio} | R Documentation |

Confidence interval formulae for *μ* are not appropriate for variables describing binary outcomes. The function `ci.p` calculates confidence intervals for the binomial parameter *π* (probability of success) using raw or summarized data. By default, Agresti-Coull point estimators are used to estimate *π* and *σ_{\hat{π}}*. If raw data are used (the default), successes should be coded as ones and failures as zeros in the `data` vector. Finite population corrections can also be specified.

ci.p(data, conf = 0.95, summarized = FALSE, phat = NULL, fpc = FALSE, n = NULL, N = NULL, method="agresti.coull", plot = TRUE)

`data` |
A vector of binary data. Required if |

`conf` |
Level of confidence 1 - |

`summarized` |
Logical; indicate whether raw data or summary stats are to be used. |

`phat` |
Estimate of |

`fpc` |
Logical. Indicates whether finite population corrections should be used. If |

`n` |
Sample size. Required if |

`N` |
Population size. Required if |

`method` |
Type of method to be used in confidence interval calculations, |

`plot` |
Logical. Should likelihood ratio plot be created with estimate from |

For the binomial distribution, the parameter of interest is the probability of success, *π*. ML estimators for the parameter, *π*, and for the standard deviation of its estimator, *σ_{\hat{π}}*, are:

*\hat{π}=\frac{x}{n},*

*\hat{σ}_{\hat{π}}=\sqrt{\frac{\hat{π}(1-\hat{π})}{n}},*

where *x* is the number of successes and *n* is the number of observations.

Because the sampling distribution of any ML estimator is asymptotically normal, an "asymptotic" 100(1 - *α*)% confidence interval for *π* is found using:

*\hat{π}\pm z_{1-(α/2)}\hat{σ}_{\hat{π}}.*

This method has also been called the Wald confidence interval.
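The Wald formula above can be sketched in a few lines of base R. This is an illustration only, not the code used inside `ci.p`; the helper name `wald_ci` and the example counts are assumptions made for the sketch.

```r
# Hypothetical helper (not part of asbio): Wald interval for a binomial proportion.
wald_ci <- function(x, n, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)        # z_{1 - alpha/2}
  p_hat <- x / n                        # ML estimate of pi
  se <- sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
  c(lower = p_hat - z * se, upper = p_hat + z * se)
}

wald_ci(x = 8, n = 20)   # illustrative counts: 8 successes in 20 trials
wald_ci(x = 0, n = 20)   # with no successes the interval collapses to a single point
```

The second call shows one reason the Wald interval is fragile: when *x* is 0 or *n*, the estimated standard error is zero and the interval has no width.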

These estimators can produce extremely inaccurate confidence intervals, particularly for small sample sizes or when *π* is near 0 or 1 (Agresti 2012). A better method is to invert the Wald binomial test statistic, varying values for *π_0* in the test statistic numerator and standard error. The interval consists of the values of *π_0* that result in a failure to reject the null hypothesis at *α*. Bounds can be obtained by finding the roots of a quadratic expansion of the binomial likelihood function (see Agresti 2012). This has been called a "score" confidence interval (Agresti 2012). A simple approximation to this method can be obtained by adding *z_{1-(α/2)}^2/2* (*\approx 2* for *α = 0.05*) to both the number of successes and the number of failures (Agresti and Coull 1998). The resulting Agresti-Coull estimators for *π* and *σ_{\hat{π}}* are:

*\hat{π}=\frac{x+z^2/2}{n+z^2},*

*\hat{σ}_{\hat{π}}=\sqrt{\frac{\hat{π}(1-\hat{π})}{n+z^2}},*

where *z* is the standard normal inverse cdf at probability 1 - *α/2*.

As above, the 100(1 - *α*)% confidence interval for the binomial parameter *π* is found using:

*\hat{π}\pm z_{1-(α/2)}\hat{σ}_{\hat{π}}.*
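As a rough check on the Agresti-Coull formulas, the sketch below computes the interval directly in base R and sets it beside the score interval returned by `prop.test(..., correct = FALSE)`. The helper name `agresti_coull_ci` and the example counts are assumptions for illustration; the internals of `ci.p` may differ in detail.

```r
# Hypothetical helper (not part of asbio): Agresti-Coull interval from the formulas above.
agresti_coull_ci <- function(x, n, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  n_adj <- n + z^2                        # adjusted number of observations
  p_adj <- (x + z^2 / 2) / n_adj          # adjusted point estimate
  se <- sqrt(p_adj * (1 - p_adj) / n_adj) # adjusted standard error
  c(lower = p_adj - z * se, upper = p_adj + z * se)
}

ac <- agresti_coull_ci(x = 8, n = 20)
# Score interval from base R (no continuity correction)
score <- as.numeric(prop.test(8, 20, correct = FALSE)$conf.int)

ac
score
```

The two intervals share the same midpoint, *(x + z^2/2)/(n + z^2)*, and the Agresti-Coull interval is never narrower than the score interval.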

The likelihood ratio method (`method = "LR"`) finds points in the binomial log-likelihood function where the difference between the maximum log-likelihood and the log-likelihood is closest to *χ_1^{2}(1 - α)/2* for support given in *π_0*. As support, the function uses `seq(0.00001, 0.99999, by = 0.00001)`.
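The grid search behind the likelihood ratio interval can be sketched as follows. This is a simplified illustration, not the `ci.p` source: it keeps every grid value whose log-likelihood drop is within the cutoff and reports the extremes, rather than searching for the closest points. The helper name `lr_ci` and the example counts are assumptions.

```r
# Hypothetical helper (not part of asbio): likelihood ratio interval by grid search.
lr_ci <- function(x, n, conf = 0.95) {
  grid <- seq(0.00001, 0.99999, by = 0.00001)     # support for pi_0
  loglik <- dbinom(x, n, grid, log = TRUE)        # log-likelihood over the grid
  max_ll <- dbinom(x, n, x / n, log = TRUE)       # log-likelihood at the ML estimate
  cutoff <- qchisq(conf, df = 1) / 2              # chi-square quantile divided by 2
  inside <- grid[max_ll - loglik <= cutoff]       # values of pi_0 not rejected
  c(lower = min(inside), upper = max(inside))
}

lr_ci(x = 8, n = 20)
```

Because the binomial log-likelihood is unimodal in *π*, the retained grid values form one contiguous range, so taking the minimum and maximum is enough.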

The "exact" method of Clopper and Pearson (1934) guarantees coverage of at least the nominal level, but actual coverage may be well above this level, particularly when *n* is small and *π* is near 0 or 1.
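The Clopper-Pearson limits can be written as quantiles of beta distributions, which makes them easy to reproduce in base R. The helper name `clopper_pearson_ci` and the example counts below are assumptions for illustration; `binom.test` is the base R function that reports the same interval.

```r
# Hypothetical helper (not part of asbio): Clopper-Pearson limits via beta quantiles.
clopper_pearson_ci <- function(x, n, conf = 0.95) {
  alpha <- 1 - conf
  lower <- if (x == 0) 0 else qbeta(alpha / 2, x, n - x + 1)
  upper <- if (x == n) 1 else qbeta(1 - alpha / 2, x + 1, n - x)
  c(lower = lower, upper = upper)
}

cp <- clopper_pearson_ci(x = 8, n = 20)
bt <- as.numeric(binom.test(8, 20)$conf.int)   # same interval from base R

cp
bt
```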

Agresti (2012) recommends the Agresti-Coull method over the normal approximation, the score method over the Agresti-Coull method, and the likelihood ratio method over all others. The Clopper-Pearson method has been repeatedly criticized as being too conservative (Agresti and Coull 1998).
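One way to see why the methods are ranked differently is to compute actual coverage for a fixed *n* and *π* by summing binomial probabilities over the outcomes whose interval contains *π*. The sketch below does this for the Wald and Clopper-Pearson intervals only; the function names and the choice of *n* = 20, *π* = 0.1 are assumptions for illustration.

```r
# Hypothetical helpers (not part of asbio): exact coverage of an interval method.
coverage <- function(ci_fun, n, p, conf = 0.95) {
  x <- 0:n
  covered <- vapply(x, function(k) {
    ci <- as.numeric(ci_fun(k, n, conf))
    ci[1] <= p && p <= ci[2]              # does the interval for outcome k contain p?
  }, logical(1))
  sum(dbinom(x, n, p)[covered])           # probability of landing on a covering outcome
}

wald <- function(x, n, conf) {
  z <- qnorm(1 - (1 - conf) / 2)
  p_hat <- x / n
  p_hat + c(-1, 1) * z * sqrt(p_hat * (1 - p_hat) / n)
}

exact <- function(x, n, conf) binom.test(x, n, conf.level = conf)$conf.int

coverage(wald, n = 20, p = 0.1)    # falls short of the nominal 0.95
coverage(exact, n = 20, p = 0.1)   # at least the nominal 0.95
```

The same `coverage` function accepts any interval function with arguments `(x, n, conf)`, so other methods can be compared in the same way.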

Returns a list of `class = "ci"`.

`pi.hat` |
Estimate for |

`S.p.hat` |
Estimate for |

`margin` |
Confidence margin. |

`ci` |
Confidence interval. |

This function contains only a few of the many methods that have been proposed for confidence interval estimation for *π*.

Ken Aho. Thanks to Simon Thelwall for finding an error with summarized data under fpc.

Agresti, A. (2012) *Categorical Data Analysis, 3rd edition*. New York. Wiley.

Agresti, A., and Coull, B. A. (1998) Approximate is better than 'exact' for interval
estimation of binomial proportions. *The American Statistician*. 52: 119-126.

Clopper, C. and Pearson, S. (1934) The use of confidence or fiducial limits illustrated in
the case of the Binomial. *Biometrika* 26: 404-413.

Ott, R. L., and Longnecker, M. T. (2004) *A First Course in Statistical Methods*.
Thompson.

Wilson, E. B. (1927) Probable inference, the law of succession, and statistical inference.
*Journal of the American Statistical Association* 22: 209-212.

    # In 2001, it was estimated that 56,200 Americans would be diagnosed with
    # non-Hodgkin's lymphoma and that 26,300 would die from it (Cernan et al. 2002).
    # Here we find the 95% confidence interval for the probability of death
    # among those diagnosed, pi.
    ci.p(c(rep(0, 56200 - 26300), rep(1, 26300)))                 # Agresti-Coull
    ci.p(c(rep(0, 56200 - 26300), rep(1, 26300)), method = "LR")  # Likelihood ratio

    # summarized = TRUE
    n <- 56200
    x <- 26300
    phat <- x / n
    ci.p(summarized = TRUE, phat = phat, n = n)  # Agresti-Coull

    # Use 2001 US population size as N
    N <- 285 * 10^6
    ci.p(c(rep(0, 56200 - 26300), rep(1, 26300)), fpc = TRUE, N = N)  # Agresti-Coull
    ci.p(summarized = TRUE, phat = phat, n = n, N = N, fpc = TRUE)    # Agresti-Coull

[Package *asbio* version 1.7 Index]