ciNormN {EnvStats}  R Documentation 
Compute the sample size necessary to achieve a specified half-width of a confidence interval for the mean of a normal distribution or the difference between two means, given the estimated standard deviation and confidence level.
ciNormN(half.width, sigma.hat = 1, conf.level = 0.95,
sample.type = ifelse(is.null(n2), "one.sample", "two.sample"),
n2 = NULL, round.up = TRUE, n.max = 5000, tol = 1e-07, maxiter = 1000)
half.width 
numeric vector of (positive) half-widths.
Missing ( 
sigma.hat 
numeric vector specifying the value(s) of the estimated standard deviation(s). 
conf.level 
numeric vector of numbers between 0 and 1 indicating the confidence level
associated with the confidence interval(s). The default value is conf.level=0.95.
sample.type 
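character string indicating whether this is a one-sample or two-sample confidence interval. The default value is "one.sample" unless the argument n2 is supplied.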
n2 
numeric vector of sample sizes for group 2. The default value is NULL, in which case equal sample sizes are assumed for the two groups.
round.up 
logical scalar indicating whether to round up the values of the computed sample size(s)
to the next largest integer. The default value is round.up=TRUE.
n.max 
positive integer greater than 1 specifying the maximum sample size for the single
group when 
tol 
numeric scalar indicating the tolerance to use in the 
maxiter 
positive integer indicating the maximum number of iterations to use in the

If the arguments half.width, n2, sigma.hat, and conf.level are not all the same length, they are replicated to be the same length as the length of the longest argument.
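The replication rule can be illustrated with base R alone. The following is a minimal sketch, not the EnvStats source: the list name design and its values are hypothetical, and rep_len is just one way to mimic the recycling described above.

```r
# Hypothetical inputs of unequal length (not taken from the EnvStats source)
design <- list(
  half.width = c(0.25, 0.5, 0.75, 1),
  sigma.hat  = 1,
  conf.level = c(0.90, 0.95)
)

# Recycle every argument to the length of the longest one
n.longest <- max(lengths(design))
recycled  <- lapply(design, rep_len, length.out = n.longest)

# One row per sample-size problem that would be solved
as.data.frame(recycled)
```

Each row of the resulting data frame corresponds to one combination of half-width, standard deviation, and confidence level.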
The function ciNormN uses the formulas given in the help file for ciNormHalfWidth for the half-width of the confidence interval to iteratively solve for the sample size. For the two-sample case, the default is to assume equal sample sizes for each group unless the argument n2 is supplied.
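To make the idea of solving for the sample size concrete, here is a minimal base-R sketch for the one-sample case. It is an illustration under stated assumptions, not the EnvStats implementation: it assumes the half-width is the usual t-interval half-width, qt(1 - alpha/2, n - 1) * sigma.hat / sqrt(n), and it replaces the tolerance-based iterative search with a plain search over integer sample sizes. The helper names half_width_one_sample and smallest_n_one_sample are hypothetical.

```r
# Hypothetical helpers for illustration; not part of EnvStats.
# Assumed one-sample half-width: t quantile times sigma.hat / sqrt(n).
half_width_one_sample <- function(n, sigma.hat = 1, conf.level = 0.95) {
  alpha <- 1 - conf.level
  qt(1 - alpha / 2, df = n - 1) * sigma.hat / sqrt(n)
}

# Smallest integer n >= 2 whose half-width does not exceed the target.
# The half-width shrinks as n grows, so the first n that qualifies is the answer.
smallest_n_one_sample <- function(half.width, sigma.hat = 1,
                                  conf.level = 0.95, n.max = 5000) {
  for (n in 2:n.max) {
    if (half_width_one_sample(n, sigma.hat, conf.level) <= half.width) {
      return(n)
    }
  }
  NA_integer_  # target not reached within n.max
}

n.needed <- sapply(seq(0.25, 1, by = 0.25), smallest_n_one_sample)
n.needed
# 64 18 10 7, the same values shown for ciNormN in the Examples
```

An integer search is enough here because only whole sample sizes are of interest. A root finder would be needed to reproduce the unrounded values returned with round.up = FALSE.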
When sample.type="one.sample", or sample.type="two.sample" and n2 is not supplied (so equal sample sizes for each group are assumed), the function ciNormN returns a numeric vector of sample sizes. When sample.type="two.sample" and n2 is supplied, the function ciNormN returns a list with two components called n1 and n2, specifying the sample sizes for each group.
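The two return shapes can be sketched in base R as follows. This is a hedged illustration rather than the EnvStats code: it assumes a pooled-variance interval for the difference between two means, with half-width qt(1 - alpha/2, n1 + n2 - 2) * sigma.hat * sqrt(1/n1 + 1/n2), and the helper names half_width_two_sample and smallest_n_two_sample are hypothetical.

```r
# Hypothetical helpers for illustration; not part of EnvStats.
# Assumes a pooled-variance interval for the difference between two means.
half_width_two_sample <- function(n1, n2, sigma.hat = 1, conf.level = 0.95) {
  alpha <- 1 - conf.level
  qt(1 - alpha / 2, df = n1 + n2 - 2) * sigma.hat * sqrt(1 / n1 + 1 / n2)
}

# Smallest group-1 size (searched from 2 upward) that reaches the target.
# If n2 is NULL, both groups share the same size and a single number is
# returned; otherwise n2 is held fixed and a list with n1 and n2 is returned.
smallest_n_two_sample <- function(half.width, sigma.hat = 1, conf.level = 0.95,
                                  n2 = NULL, n.max = 5000) {
  for (n1 in 2:n.max) {
    m2 <- if (is.null(n2)) n1 else n2
    if (half_width_two_sample(n1, m2, sigma.hat, conf.level) <= half.width) {
      return(if (is.null(n2)) n1 else list(n1 = n1, n2 = n2))
    }
  }
  NA  # target half-width not reachable within n.max
}

equal.n  <- smallest_n_two_sample(half.width = 1)           # one number
fixed.n2 <- smallest_n_two_sample(half.width = 1, n2 = 20)  # list(n1, n2)
```

With n2 held fixed, the half-width cannot fall much below the value implied by sigma.hat / sqrt(n2) however large n1 becomes, which is why the sketch returns NA when the target is out of reach.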
The normal distribution and lognormal distribution are probably the two most frequently used distributions to model environmental data. In order to make any kind of probability statement about a normally distributed population (of chemical concentrations, for example), you first have to estimate the mean and standard deviation (the population parameters) of the distribution. Once you estimate these parameters, it is often useful to characterize the uncertainty in the estimate of the mean. This is done with confidence intervals.
In the course of designing a sampling program, an environmental scientist may wish to determine the relationship between sample size, confidence level, and half-width if one of the objectives of the sampling program is to produce confidence intervals. The functions ciNormHalfWidth, ciNormN, and plotCiNormDesign can be used to investigate these relationships for the case of normally distributed observations.
Steven P. Millard (EnvStats@ProbStatInfo.com)
Berthouex, P.M., and L.C. Brown. (2002). Statistics for Environmental Engineers. Second Edition. Lewis Publishers, Boca Raton, FL.
Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York, NY.
Helsel, D.R., and R.M. Hirsch. (1992). Statistical Methods in Water Resources Research. Elsevier, New York, NY, Chapter 7.
Millard, S.P., and N. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton, FL.
Ott, W.R. (1995). Environmental Statistics and Data Analysis. Lewis Publishers, Boca Raton, FL.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C. p.21-3.
Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ, Chapters 7 and 8.
ciNormHalfWidth, plotCiNormDesign, Normal, enorm, t.test, Estimating Distribution Parameters.
# Look at how the required sample size for a one-sample
# confidence interval decreases with increasing half-width:
seq(0.25, 1, by = 0.25)
#[1] 0.25 0.50 0.75 1.00
ciNormN(half.width = seq(0.25, 1, by = 0.25))
#[1] 64 18 10 7
ciNormN(seq(0.25, 1, by = 0.25), round.up = FALSE)
#[1] 63.897899 17.832337 9.325967 6.352717
#
# Look at how the required sample size for a one-sample
# confidence interval increases with increasing estimated
# standard deviation for a fixed half-width:
seq(0.5, 2, by = 0.5)
#[1] 0.5 1.0 1.5 2.0
ciNormN(half.width = 0.5, sigma.hat = seq(0.5, 2, by = 0.5))
#[1] 7 18 38 64
#
# Look at how the required sample size for a one-sample
# confidence interval increases with increasing confidence
# level for a fixed half-width:
seq(0.5, 0.9, by = 0.1)
#[1] 0.5 0.6 0.7 0.8 0.9
ciNormN(half.width = 0.25, conf.level = seq(0.5, 0.9, by = 0.1))
#[1] 9 13 19 28 46
#
# Modifying the example on pages 21-4 to 21-5 of USEPA (2009),
# determine the required sample size in order to achieve a
# half-width that is 10% of the observed mean (based on the first
# four months of observations) for the Aldicarb level at the first
# compliance well. Assume a 95% confidence level and use the
# estimated standard deviation from the first four months of data.
# (The data are stored in EPA.09.Ex.21.1.aldicarb.df.)
#
# The required sample size is 20, so almost two years of data are
# required assuming observations are taken once per month.
EPA.09.Ex.21.1.aldicarb.df
#  Month   Well Aldicarb.ppb
#1     1 Well.1         19.9
#2     2 Well.1         29.6
#3     3 Well.1         18.7
#4     4 Well.1         24.2
#...
mu.hat <- with(EPA.09.Ex.21.1.aldicarb.df,
  mean(Aldicarb.ppb[Well == "Well.1"]))
mu.hat
#[1] 23.1
sigma.hat <- with(EPA.09.Ex.21.1.aldicarb.df,
  sd(Aldicarb.ppb[Well == "Well.1"]))
sigma.hat
#[1] 4.93491
ciNormN(half.width = 0.1 * mu.hat, sigma.hat = sigma.hat)
#[1] 20
#
# Clean up
rm(mu.hat, sigma.hat)