predIntNormN {EnvStats}  R Documentation
Sample Size for a Specified Half-Width of a Prediction Interval for the Next k
Observations from a Normal Distribution
Compute the sample size necessary to achieve a specified half-width of a
prediction interval for the next k observations from a normal distribution.
predIntNormN(half.width, n.mean = 1, k = 1, sigma.hat = 1,
    method = "Bonferroni", conf.level = 0.95, round.up = TRUE,
    n.max = 5000, tol = 1e-7, maxiter = 1000)
half.width 
numeric vector of (positive) half-widths.
Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are not allowed.
n.mean 
numeric vector of positive integers specifying the sample size associated with
the k future averages. The default value is n.mean=1 (i.e., individual
observations).
k 
numeric vector of positive integers specifying the number of future observations
or averages the prediction interval should contain with confidence level
conf.level. The default value is k=1.
sigma.hat 
numeric vector specifying the value(s) of the estimated standard deviation(s).
The default value is sigma.hat=1.
method 
character string specifying the method to use if the number of future observations
(k) is greater than 1. The default value is method="Bonferroni".
conf.level 
numeric vector of values between 0 and 1 indicating the confidence level of the
prediction interval. The default value is conf.level=0.95.
round.up 
logical scalar indicating whether to round up the values of the computed sample
size(s) to the next largest integer. The default value is round.up=TRUE.
n.max 
positive integer greater than 1 indicating the maximum possible sample size. The
default value is n.max=5000.
tol 
numeric scalar indicating the tolerance to use in the uniroot search algorithm.
The default value is tol=1e-7.
maxiter 
positive integer indicating the maximum number of iterations to use in the
uniroot search algorithm. The default value is maxiter=1000.
If the arguments half.width, k, n.mean, sigma.hat, and conf.level are not all
the same length, they are replicated to be the same length as the length of the
longest argument.
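The replication rule above can be illustrated with a minimal base R sketch. The helper recycle_args is hypothetical (it is not part of EnvStats) and only assumes that shorter vectors are repeated element by element until they match the longest one; how predIntNormN does this internally is not shown in this help file.

```r
# Hypothetical helper, not an EnvStats function: recycle every argument
# to the length of the longest one, as described in the paragraph above.
recycle_args <- function(...) {
  args <- list(...)
  n <- max(lengths(args))                 # length of the longest argument
  lapply(args, rep_len, length.out = n)   # repeat shorter vectors to length n
}

out <- recycle_args(half.width = 3, k = 1:5, conf.level = c(0.9, 0.95))
str(out)
```

Under this assumption, a call such as predIntNormN(half.width = 3, k = 1:5) is evaluated as five separate problems, one for each value of k, all sharing the same half-width.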
The help files for predIntNorm and predIntNormK give formulas for a two-sided
prediction interval based on the sample size, the observed sample mean and
sample standard deviation, and specified confidence level.
Specifically, the two-sided prediction interval is given by:
[\bar{x} - Ks, \bar{x} + Ks] \;\;\;\;\;\; (1)
where \bar{x} denotes the sample mean:
\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i \;\;\;\;\;\; (2)
s denotes the sample standard deviation:
s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2 \;\;\;\;\;\; (3)
and K denotes a constant that depends on the sample size n, the confidence
level, the number of future averages k, and the sample size associated with the
future averages, m (see the help file for predIntNormK). Thus, the half-width of
the prediction interval is given by:
HW = Ks \;\;\;\;\;\; (4)
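To make equation (4) concrete, here is a minimal base R sketch of the half-width for the two-sided case. It assumes the usual t-based Bonferroni form of the multiplier, K = t(n-1, 1 - alpha/(2k)) * sqrt(1/m + 1/n) with alpha = 1 - conf.level; that formula and the function name pred_int_k_sketch are assumptions made for illustration, and the help file for predIntNormK remains the authoritative definition of K.

```r
# Sketch only: two-sided multiplier K under the Bonferroni approach.
# The formula is an assumption stated in the lead-in, not EnvStats code.
pred_int_k_sketch <- function(n, k = 1, m = 1, conf.level = 0.95) {
  alpha <- 1 - conf.level
  t.quantile <- qt(1 - alpha / (2 * k), df = n - 1)  # Bonferroni-adjusted t quantile
  t.quantile * sqrt(1 / m + 1 / n)
}

s <- 1                                  # sample standard deviation
hw5 <- pred_int_k_sketch(n = 5) * s     # half-width HW = K * s for n = 5
hw6 <- pred_int_k_sketch(n = 6) * s     # half-width HW = K * s for n = 6
c(n5 = hw5, n6 = hw6)
```

With s = 1 and k = 1, the half-width is above 3 at n = 5 and below 3 at n = 6, which agrees with the value 6 reported for predIntNormN(half.width = 3, k = 1) in the examples of this help file.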
The function predIntNormN uses the uniroot search algorithm to determine the
sample size for specified values of the half-width, number of observations used
to create a single future average, number of future observations or averages,
the sample standard deviation, and the confidence level. Note that unlike a
confidence interval, the half-width of a prediction interval does not approach 0
as the sample size increases.
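The search described above can be sketched in base R as a root-finding problem in a continuous n. This is an illustration under stated assumptions, not the EnvStats source: sample_size_sketch is a hypothetical name, K is taken to have the two-sided Bonferroni form t(n-1, 1 - alpha/(2k)) * sqrt(1/m + 1/n), and returning NA when no n up to n.max works is a choice made for this sketch only.

```r
# Sketch only: solve K(n) * sigma.hat = half.width for a continuous n.
# The Bonferroni form of K and the NA return value are assumptions.
sample_size_sketch <- function(half.width, k = 1, m = 1, sigma.hat = 1,
                               conf.level = 0.95, n.max = 5000,
                               tol = 1e-7, maxiter = 1000) {
  alpha <- 1 - conf.level
  # Achieved half-width minus the requested half-width at sample size n
  excess <- function(n) {
    qt(1 - alpha / (2 * k), df = n - 1) * sqrt(1 / m + 1 / n) * sigma.hat -
      half.width
  }
  if (excess(n.max) > 0) return(NA_real_)  # unreachable even at n.max
  if (excess(2) <= 0) return(2)            # already satisfied at n = 2
  uniroot(excess, interval = c(2, n.max), tol = tol, maxiter = maxiter)$root
}

n.cont <- sample_size_sketch(half.width = 3)
ceiling(n.cont)   # rounded up, as with round.up = TRUE
```

Under the assumed formula, K tends to the standard normal quantile z(1 - alpha/(2k)) times sqrt(1/m) as n grows, which is about 1.96 for the default arguments. A requested half-width below roughly 1.96 * sigma.hat therefore has no solution in this sketch, which is the practical meaning of the note that the half-width does not approach 0.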
predIntNormN returns a numeric vector of sample sizes.
See the help file for predIntNorm.
Steven P. Millard (EnvStats@ProbStatInfo.com)
See the help file for predIntNorm.
See also predIntNorm, predIntNormK, predIntNormHalfWidth, and
plotPredIntNormDesign.
# Look at how the required sample size for a prediction interval increases
# with increasing number of future observations:
1:5
#[1] 1 2 3 4 5
predIntNormN(half.width = 3, k = 1:5)
#[1] 6 9 11 14 18
#
# Look at how the required sample size for a prediction interval decreases
# with increasing half-width:
2:5
#[1] 2 3 4 5
predIntNormN(half.width = 2:5)
#[1] 86 6 4 3
predIntNormN(2:5, round.up = FALSE)
#[1] 85.567387 5.122911 3.542393 2.987861
#
# Look at how the required sample size for a prediction interval increases
# with increasing estimated standard deviation for a fixed half-width:
seq(0.5, 2, by = 0.5)
#[1] 0.5 1.0 1.5 2.0
predIntNormN(half.width = 4, sigma.hat = seq(0.5, 2, by = 0.5))
#[1] 3 4 7 86
#
# Look at how the required sample size for a prediction interval increases
# with increasing confidence level for a fixed half-width:
seq(0.5, 0.9, by = 0.1)
#[1] 0.5 0.6 0.7 0.8 0.9
predIntNormN(half.width = 2, conf.level = seq(0.5, 0.9, by = 0.1))
#[1] 2 2 3 4 9
#==========
# The data frame EPA.92c.arsenic3.df contains arsenic concentrations (ppb)
# collected quarterly for 3 years at a background well and quarterly for
# 2 years at a compliance well. Using the data from the background well,
# compute the required sample size in order to achieve a half-width of
# 2.25, 2.5, or 3 times the estimated standard deviation for a two-sided
# 90% prediction interval for k=4 future observations.
#
# For a half-width of 2.25 standard deviations, the required sample size is 526,
# or about 131 years of quarterly observations! For a half-width of 2.5
# standard deviations, the required sample size is 20, or about 5 years of
# quarterly observations. For a half-width of 3 standard deviations, the required
# sample size is 9, or about 2 years of quarterly observations.
EPA.92c.arsenic3.df
# Arsenic Year Well.type
#1 12.6 1 Background
#2 30.8 1 Background
#3 52.0 1 Background
#...
#18 3.8 5 Compliance
#19 2.6 5 Compliance
#20 51.9 5 Compliance
mu.hat <- with(EPA.92c.arsenic3.df,
    mean(Arsenic[Well.type == "Background"]))
mu.hat
#[1] 27.51667
sigma.hat <- with(EPA.92c.arsenic3.df,
    sd(Arsenic[Well.type == "Background"]))
sigma.hat
#[1] 17.10119
predIntNormN(half.width = c(2.25, 2.5, 3) * sigma.hat, k = 4,
    sigma.hat = sigma.hat, conf.level = 0.9)
#[1] 526 20 9
#==========
# Clean up
#
rm(mu.hat, sigma.hat)