plotCiBinomDesign {EnvStats}  R Documentation 
Create plots for a sampling design based on a confidence interval for a binomial proportion or the difference between two proportions.
plotCiBinomDesign(x.var = "n", y.var = "half.width",
range.x.var = NULL, n.or.n1 = 25, p.hat.or.p1.hat = 0.5,
n2 = n.or.n1, p2.hat = 0.4, ratio = 1, half.width = 0.05,
conf.level = 0.95, sample.type = "one.sample", ci.method = "score",
correct = TRUE, warn = TRUE, n.or.n1.min = 2,
n.or.n1.max = 10000, tol.half.width = 0.005, tol.p.hat = 0.005,
maxiter = 10000, plot.it = TRUE, add = FALSE, n.points = 100,
plot.col = 1, plot.lwd = 3 * par("cex"), plot.lty = 1,
digits = .Options$digits,
main = NULL, xlab = NULL, ylab = NULL, type = "l", ...)
x.var 
character string indicating what variable to use for the x-axis. The default value is x.var="n".
y.var 
character string indicating what variable to use for the y-axis. The default value is y.var="half.width".
range.x.var 
numeric vector of length 2 indicating the range of the x-variable to use for the plot.
The default value depends on the value of x.var.
n.or.n1 
numeric scalar indicating the sample size. The default value is n.or.n1=25.
p.hat.or.p1.hat 
numeric scalar indicating an estimated proportion. 
n2 
numeric scalar indicating the sample size for group 2. The default value is the value of n.or.n1.
p2.hat 
numeric scalar indicating the estimated proportion for group 2.
Missing ( 
ratio 
numeric vector indicating the ratio of sample size in group 2 to sample size in group 1 (n2/n1). The default value is ratio=1.
half.width 
positive numeric scalar indicating the half-width of the confidence interval.
The default value is half.width=0.05.
conf.level 
a numeric scalar between 0 and 1 indicating the confidence level associated with the confidence intervals.
The default value is conf.level=0.95.
sample.type 
character string indicating whether this is a one-sample or two-sample confidence interval.
When 
ci.method 
character string indicating which method to use to construct the confidence interval.
Possible values are 
correct 
logical scalar indicating whether to use the continuity correction when 
warn 
logical scalar indicating whether to issue a warning when 
n.or.n1.min 
for the case when 
n.or.n1.max 
for the case when 
tol.half.width 
for the case when 
tol.p.hat 
for the case when 
maxiter 
for the case when 
plot.it 
a logical scalar indicating whether to create a plot or add to the existing plot
(see description of the argument add). The default value is plot.it=TRUE.
add 
a logical scalar indicating whether to add the design plot to the existing plot
(add=TRUE) or to create a new plot (add=FALSE). The default value is add=FALSE.
n.points 
a numeric scalar specifying how many (x,y) pairs to use to produce the plot.
There are 
plot.col 
a numeric scalar or character string determining the color of the plotted line or points. The default value
is plot.col=1.
plot.lwd 
a numeric scalar determining the width of the plotted line. The default value is
plot.lwd=3*par("cex").
plot.lty 
a numeric scalar determining the line type of the plotted line. The default value is
plot.lty=1.
digits 
a scalar indicating how many significant digits to print out on the plot. The default
value is the current setting of options("digits").
main , xlab , ylab , type , ... 
additional graphical parameters (see par).
See the help files for ciBinomHalfWidth and ciBinomN
for information on how to compute a one-sample confidence interval for
a single binomial proportion or a two-sample confidence interval for the difference between
two proportions, how the half-width is computed when other quantities are fixed, and how
the sample size is computed when other quantities are fixed.
plotCiBinomDesign
invisibly returns a list with components:
x.var 
x-coordinates of the points that have been or would have been plotted 
y.var 
y-coordinates of the points that have been or would have been plotted 
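Because the list is returned invisibly, it must be assigned to a name before it can be inspected. The following is a minimal sketch, assuming the EnvStats package is installed. The object names design and comp.names are arbitrary choices for this illustration, and plot.it = FALSE is used so that the coordinates are computed without drawing anything.

```r
# Guard the call so the sketch does nothing if EnvStats is not installed
# (assumption: the package is available from the library path).
if (requireNamespace("EnvStats", quietly = TRUE)) {
  # Default design: half-width vs. sample size, one-sample score interval.
  # plot.it = FALSE suppresses the plot; the (x, y) values are still returned.
  design <- EnvStats::plotCiBinomDesign(plot.it = FALSE)

  # Inspect the returned components (x.var and y.var, per the list above)
  comp.names <- names(design)
  str(design)
}
```

The same values could then be passed to plot() or lines() to build a customized display.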
The binomial distribution is used to model processes with binary
(Yes/No, Success/Failure, Heads/Tails, etc.) outcomes. It is assumed that the outcome of any
one trial is independent of any other trial, and that the probability of “success”, p,
is the same on each trial. A binomial discrete random variable X is the number of
“successes” in n independent trials. A special case of the binomial distribution
occurs when n=1, in which case X is also called a Bernoulli random variable.
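As a small illustration of these definitions, the base R sketch below evaluates the binomial probability mass function with dbinom and checks the Bernoulli special case. The values n = 10 and p = 0.3 are arbitrary choices made for this illustration and do not come from this help file.

```r
# Illustrative values (assumptions, not taken from the help file)
n <- 10   # number of independent trials
p <- 0.3  # probability of "success" on each trial

# P(X = x) for x = 0, 1, ..., n
pmf <- dbinom(0:n, size = n, prob = p)

# The probabilities sum to 1 and the mean of X is n * p
total  <- sum(pmf)
mean.X <- sum((0:n) * pmf)

# Bernoulli special case (n = 1): P(X = 0) = 1 - p and P(X = 1) = p
bern <- dbinom(c(0, 1), size = 1, prob = p)
```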
In the context of environmental statistics, the binomial distribution is sometimes used to model
the proportion of times a chemical concentration exceeds a set standard in a given period of time
(e.g., Gilbert, 1987, p.143), or to compare the proportion of detects in a compliance well vs. a
background well (e.g., USEPA, 1989b, Chapter 8, p.37). (However, USEPA 2009, p.827
recommends using the Wilcoxon rank sum test (wilcox.test) instead of
comparing proportions.)
In the course of designing a sampling program, an environmental scientist may wish to determine
the relationship between sample size, confidence level, and half-width if one of the objectives of
the sampling program is to produce confidence intervals. The functions ciBinomHalfWidth,
ciBinomN, and plotCiBinomDesign can be used to investigate these
relationships for the case of binomial proportions.
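To make the trade-off concrete, here is a rough base R sketch that uses the simple Wald approximation, half-width = z * sqrt(p.hat * (1 - p.hat) / n). This is not the score method that plotCiBinomDesign uses by default, so its numbers will differ from those of the EnvStats functions. The helper names wald.half.width and wald.n are hypothetical and exist only for this illustration.

```r
# Wald half-width for a one-sample confidence interval (illustrative helper)
wald.half.width <- function(n, p.hat = 0.5, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)  # two-sided normal quantile
  z * sqrt(p.hat * (1 - p.hat) / n)
}

# Smallest n whose Wald half-width does not exceed the target (illustrative helper)
wald.n <- function(half.width, p.hat = 0.5, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  ceiling(z^2 * p.hat * (1 - p.hat) / half.width^2)
}

# Quadrupling the sample size halves the half-width
n  <- c(25, 100, 400)
hw <- wald.half.width(n)      # about 0.196, 0.098, 0.049

# Sample size for a half-width of 0.05 at 95% confidence with p.hat = 0.5
n.needed <- wald.n(0.05)      # 385
```

Raising conf.level increases z and therefore the half-width at a fixed n, which is the same qualitative relationship the design plots display.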
Steven P. Millard (EnvStats@ProbStatInfo.com)
Agresti, A., and B.A. Coull. (1998). Approximate is Better than "Exact" for Interval Estimation of Binomial Proportions. The American Statistician, 52(2), 119–126.
Agresti, A., and B. Caffo. (2000). Simple and Effective Confidence Intervals for Proportions and Differences of Proportions Result from Adding Two Successes and Two Failures. The American Statistician, 54(4), 280–288.
Berthouex, P.M., and L.C. Brown. (1994). Statistics for Environmental Engineers. Lewis Publishers, Boca Raton, FL, Chapters 2 and 15.
Cochran, W.G. (1977). Sampling Techniques. John Wiley and Sons, New York, Chapter 3.
Fisher, R.A., and F. Yates. (1963). Statistical Tables for Biological, Agricultural, and Medical Research. 6th edition. Hafner, New York, 146pp.
Fleiss, J.L. (1981). Statistical Methods for Rates and Proportions. Second Edition. John Wiley and Sons, New York, Chapters 1-2.
Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York, NY, Chapter 11.
Newcombe, R.G. (1998a). Two-Sided Confidence Intervals for the Single Proportion: Comparison of Seven Methods. Statistics in Medicine, 17, 857–872.
Newcombe, R.G. (1998b). Interval Estimation for the Difference Between Independent Proportions: Comparison of Eleven Methods. Statistics in Medicine, 17, 873–890.
Ott, W.R. (1995). Environmental Statistics and Data Analysis. Lewis Publishers, Boca Raton, FL, Chapter 4.
USEPA. (1989b). Statistical Analysis of Ground-Water Monitoring Data at RCRA Facilities, Interim Final Guidance. EPA/530-SW-89-026. Office of Solid Waste, U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.
Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ, Chapter 24.
ciBinomHalfWidth, ciBinomN, ebinom, binom.test, prop.test, par.
# Look at the relationship between half-width and sample size
# for a one-sample confidence interval for a binomial proportion,
# assuming an estimated proportion of 0.5 and a confidence level of
# 95%. The jigsaw appearance of the plot is the result of using the
# score method:
dev.new()
plotCiBinomDesign()
#
# Redo the example above, but use the traditional (and inaccurate)
# Wald method.
dev.new()
plotCiBinomDesign(ci.method = "Wald")
#
# Plot sample size vs. the estimated proportion for various half-widths,
# using a 95% confidence level and the default score method with
# continuity correction (no ci.method is supplied in the calls below):
# NOTE: This example takes several seconds to run so it has been
# commented out. Simply remove the pound signs (#) from in front
# of the R commands to run it.
#dev.new()
#plotCiBinomDesign(x.var = "p.hat", y.var = "n",
# half.width = 0.04, ylim = c(0, 600), main = "",
# xlab = expression(hat(p)))
#
#plotCiBinomDesign(x.var = "p.hat", y.var = "n",
# half.width = 0.05, add = TRUE, plot.col = 2)
#
#plotCiBinomDesign(x.var = "p.hat", y.var = "n",
# half.width = 0.06, add = TRUE, plot.col = 3)
#
#legend(0.5, 150, paste("Half-Width =", c(0.04, 0.05, 0.06)),
# lty = rep(1, 3), lwd = rep(2, 3), col=1:3, bty = "n")
#
#mtext(expression(paste("Sample Size vs. ", hat(p),
# " for Confidence Interval for p")), line = 2.5, cex = 1.25)
#mtext("with Confidence=95% and Various Values of Half-Width",
# line = 1.5, cex = 1.25)
#mtext(paste("CI Method = Score Normal Approximation",
# "with Continuity Correction"), line = 0.5)
#
# Modifying the example on pages 8-5 to 8-7 of USEPA (1989b),
# look at the relationship between half-width and sample size
# for a 95% confidence interval for the difference between the
# proportion of detects at the background and compliance wells.
# Use the estimated proportion of detects from the original data.
# (The data are stored in EPA.89b.cadmium.df.)
# Assume equal sample sizes at each well.
EPA.89b.cadmium.df
# Cadmium.orig Cadmium Censored Well.type
#1 0.1 0.100 FALSE Background
#2 0.12 0.120 FALSE Background
#3 BDL 0.000 TRUE Background
# ..........................................
#86 BDL 0.000 TRUE Compliance
#87 BDL 0.000 TRUE Compliance
#88 BDL 0.000 TRUE Compliance
p.hat.back <- with(EPA.89b.cadmium.df,
  mean(!Censored[Well.type=="Background"]))
p.hat.back
#[1] 0.3333333
p.hat.comp <- with(EPA.89b.cadmium.df,
  mean(!Censored[Well.type=="Compliance"]))
p.hat.comp
#[1] 0.375
dev.new()
plotCiBinomDesign(p.hat.or.p1.hat = p.hat.back,
p2.hat = p.hat.comp, digits=3)
#==========
# Clean up
#
rm(p.hat.back, p.hat.comp)
graphics.off()