ci.p {asbio} | R Documentation |

## Confidence interval estimation for the binomial parameter pi using five popular methods.

### Description

Confidence interval formulae for `\mu` are not appropriate for variables describing binary outcomes. The function `ci.p` calculates confidence intervals for the binomial parameter `\pi` (probability of success) using raw or summarized data. By default, Agresti-Coull point estimators are used to estimate `\pi` and `\sigma_{\hat{\pi}}`. If raw data are used (the default), successes should be indicated as ones and failures as zeros in the `data` vector. Finite population corrections can also be specified.

### Usage

```
ci.p(data, conf = 0.95, summarized = FALSE, phat = NULL,
fpc = FALSE, n = NULL, N = NULL, method="agresti.coull", plot = TRUE)
```

### Arguments

`data` |
A vector of binary data. Required if |

`conf` |
Level of confidence 1 - |

`summarized` |
Logical; indicate whether raw data or summary stats are to be used. |

`phat` |
Estimate of |

`fpc` |
Logical. Indicates whether finite population corrections should be used. If |

`n` |
Sample size. Required if |

`N` |
Population size. Required if |

`method` |
Type of method to be used in confidence interval calculations, |

`plot` |
Logical. Should likelihood ratio plot be created with estimate from |

### Details

For the binomial distribution, the parameter of interest is the probability of success, `\pi`. ML estimators for the parameter, `\pi`, and the standard deviation of its estimator, `\sigma_{\hat{\pi}}`, are:

`\hat{\pi}=\frac{x}{n},`

`\hat{\sigma}_{\hat{\pi}}=\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}}`

where *x* is the number of successes and *n* is the number of observations.

Because the sampling distribution of any ML estimator is asymptotically normal, an "asymptotic" 100(1 - `\alpha`)% confidence interval for `\pi` is found using:

`\hat{\pi}\pm z_{1-(\alpha/2)}\hat{\sigma}_{\hat{\pi}}.`

This method has also been called the Wald confidence interval.
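The Wald calculation can be sketched in base R as follows. This is a minimal illustration of the formulas above, not the code used inside `ci.p`; the helper name `wald_ci` is hypothetical.

```r
# Wald (asymptotic) interval from summarized counts.
# wald_ci is a hypothetical helper for illustration, not part of asbio.
wald_ci <- function(x, n, conf = 0.95) {
  pi_hat <- x / n                              # ML estimate of pi
  se     <- sqrt(pi_hat * (1 - pi_hat) / n)    # estimated standard error
  z      <- qnorm(1 - (1 - conf) / 2)          # z_{1 - alpha/2}
  c(lower = pi_hat - z * se, upper = pi_hat + z * se)
}

wald_ci(x = 26300, n = 56200)  # large n: interval is narrow and well behaved
wald_ci(x = 1, n = 10)         # small n, pi near 0: lower bound falls below 0
```

The second call shows one way the interval can misbehave: the lower limit lies outside the parameter space.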

These estimators can create extremely inaccurate confidence intervals, particularly for small sample sizes or when `\pi` is near 0 or 1 (Agresti 2012). A better method is to invert the Wald binomial test statistic and vary values for `\pi_0` in the test statistic numerator and standard error. The interval consists of the values of `\pi_0` that result in a failure to reject the null hypothesis at `\alpha`. Bounds can be obtained by finding the roots of a quadratic expansion of the binomial likelihood function (see Agresti 2012). This has been called a "score" confidence interval (Agresti 2012). A simple approximation to this method can be obtained by adding `z_{1-(\alpha/2)}^2/2` (`\approx 2` for `\alpha = 0.05`) to both the number of successes and the number of failures (Agresti and Coull 1998). The resulting Agresti-Coull estimators for `\pi` and `\sigma_{\hat{\pi}}` are:

`\hat{\pi}=\frac{x+z^2/2}{n+z^2},`

`\hat{\sigma}_{\hat{\pi}}=\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n+z^2}}.`

where `z` is the standard normal inverse cdf at probability 1 - `\alpha/2`.

As above, the 100(1 - `\alpha`)% confidence interval for the binomial parameter `\pi` is found using:

`\hat{\pi}\pm z_{1-(\alpha/2)}\hat{\sigma}_{\hat{\pi}}.`
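As a rough sketch, the Agresti-Coull estimators above translate directly into base R. The helper `agresti_coull_ci` is hypothetical and ignores finite population corrections; it is not the implementation in `ci.p`. For comparison, `prop.test` with `correct = FALSE` returns the score (Wilson) interval, which shares the same midpoint.

```r
# Agresti-Coull interval from summarized counts.
# agresti_coull_ci is a hypothetical helper for illustration, not part of asbio.
agresti_coull_ci <- function(x, n, conf = 0.95) {
  z      <- qnorm(1 - (1 - conf) / 2)            # z_{1 - alpha/2}
  n_adj  <- n + z^2                              # adjusted sample size
  pi_adj <- (x + z^2 / 2) / n_adj                # adjusted point estimate
  se     <- sqrt(pi_adj * (1 - pi_adj) / n_adj)  # adjusted standard error
  c(estimate = pi_adj, lower = pi_adj - z * se, upper = pi_adj + z * se)
}

agresti_coull_ci(x = 15, n = 50)

# Score (Wilson) interval from base R, without continuity correction
prop.test(15, 50, correct = FALSE)$conf.int
```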

The likelihood ratio method (`method = "LR"`) finds the points in the binomial log-likelihood function where the difference between the maximized log-likelihood and the log-likelihood function is closest to `\chi_1^{2}(1 - \alpha)/2` for support given in `\pi_0`. As support, the function uses `seq(0.00001, 0.99999, by = 0.00001)`.
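The grid search can be approximated in a few lines of base R. The sketch below is an assumption about how such a search might look, not the code in `ci.p`: instead of locating the grid points closest to the cutoff, it takes the range of grid values whose log-likelihood lies within the cutoff of the maximum. The helper name `lr_ci` is hypothetical.

```r
# Likelihood ratio interval by grid search over pi_0.
# lr_ci is a hypothetical helper for illustration, not part of asbio.
lr_ci <- function(x, n, conf = 0.95,
                  grid = seq(0.00001, 0.99999, by = 0.00001)) {
  loglik <- dbinom(x, n, grid, log = TRUE)    # log-likelihood over the support
  max_ll <- dbinom(x, n, x / n, log = TRUE)   # log-likelihood at the ML estimate
  cutoff <- qchisq(conf, df = 1) / 2          # chi^2_1(1 - alpha) / 2
  inside <- grid[max_ll - loglik <= cutoff]   # values of pi_0 not rejected
  c(lower = min(inside), upper = max(inside))
}

lr_ci(x = 15, n = 50)
```

The resolution of the interval limits is set by the grid spacing, here 0.00001.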

The "exact" method of Clopper and Pearson (1934) guarantees coverage of at least the nominal level, but actual coverage may be well above this level, particularly when *n* is small and `\pi` is near 0 or 1.
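The Clopper-Pearson limits can be written as beta quantiles, and the actual coverage at a given `\pi` can be computed exactly by summing binomial probabilities. The sketch below uses hypothetical helpers (`clopper_pearson_ci`, `coverage`) and is not the code in `ci.p`; base R's `binom.test` reports the same interval.

```r
# Clopper-Pearson ("exact") interval via beta quantiles.
# Both helpers are hypothetical, for illustration only.
clopper_pearson_ci <- function(x, n, conf = 0.95) {
  alpha <- 1 - conf
  lower <- if (x == 0) 0 else qbeta(alpha / 2, x, n - x + 1)
  upper <- if (x == n) 1 else qbeta(1 - alpha / 2, x + 1, n - x)
  c(lower = lower, upper = upper)
}

# Exact coverage probability at a true value pi: total probability of
# the outcomes x whose interval contains pi.
coverage <- function(n, pi, conf = 0.95) {
  covered <- sapply(0:n, function(x) {
    ci <- clopper_pearson_ci(x, n, conf)
    ci[["lower"]] <= pi && pi <= ci[["upper"]]
  })
  sum(dbinom(0:n, n, pi)[covered])
}

clopper_pearson_ci(15, 50)
binom.test(15, 50)$conf.int   # same limits from base R
coverage(n = 10, pi = 0.1)    # compare with the nominal 0.95
```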

Agresti (2012) recommends the Agresti-Coull method over the normal approximation, the score method over the Agresti-Coull method, and the likelihood ratio method over all others. The Clopper-Pearson method has been repeatedly criticized as being too conservative (Agresti and Coull 1998).

### Value

Returns a list of `class = "ci"`.

`pi.hat` |
Estimate for |

`S.p.hat` |
Estimate for |

`margin` |
Confidence margin. |

`ci` |
Confidence interval. |

### Note

This function contains only a few of the many methods that have been proposed for confidence interval estimation for `\pi`.

### Author(s)

Ken Aho. Thanks to Simon Thelwall for finding an error with summarized data under fpc.

### References

Agresti, A. (2012) *Categorical Data Analysis, 3rd edition*. New York. Wiley.

Agresti, A., and Coull, B. A. (1998) Approximate is better than 'exact' for interval
estimation of binomial proportions. *The American Statistician*. 52: 119-126.

Clopper, C. J., and Pearson, E. S. (1934) The use of confidence or fiducial limits illustrated in
the case of the Binomial. *Biometrika* 26: 404-413.

Ott, R. L., and Longnecker, M. T. (2004) *A First Course in Statistical Methods*.
Thompson.

Wilson, E. B. (1927) Probable inference, the law of succession, and statistical inference.
*Journal of the American Statistical Association* 22: 209-212.

### See Also

### Examples

```
#In 2001, it was estimated that 56,200 Americans would be diagnosed with
# non-Hodgkin's lymphoma and that 26,300 would die from it (Cernan et al. 2002).
# Here we find the 95% confidence interval for the probability of diagnosis, pi.
ci.p(c(rep(0, 56200-26300),rep(1,26300))) # Agresti-Coull
ci.p(c(rep(0, 56200-26300),rep(1,26300)), method = "LR") # Likelihood Ratio
# summarized = TRUE
n = 56200
x = 26300
phat = x/n
ci.p(summarized = TRUE, phat = phat, n = n) # Agresti-Coull
# Use 2001 US population size as N
N <- 285 * 10^6
ci.p(c(rep(0, 56200-26300),rep(1,26300)), fpc = TRUE, N = N) # Agresti-Coull
ci.p(summarized = TRUE, phat = phat, n = n, N = N, fpc = TRUE) # Agresti-Coull
```

Package *asbio* version 1.9-7