predIntNormTestPower {EnvStats} | R Documentation |
Probability That at Least One Future Observation Falls Outside a Prediction Interval for a Normal Distribution
Description
Compute the probability that at least one out of $k$ future observations
(or means) falls outside a prediction interval for $k$ future observations
(or means) for a normal distribution.
Usage
predIntNormTestPower(n, df = n - 1, n.mean = 1, k = 1, delta.over.sigma = 0,
pi.type = "upper", conf.level = 0.95)
Arguments
n |
vector of positive integers greater than 2 indicating the sample size upon which the prediction interval is based. |
df |
vector of positive integers indicating the degrees of freedom associated with
the sample size. The default value is df=n-1. |
n.mean |
positive integer specifying the sample size associated with the future averages.
The default value is n.mean=1 (i.e., individual observations). |
k |
vector of positive integers specifying the number of future observations that the
prediction interval should contain with confidence level conf.level.
The default value is k=1. |
delta.over.sigma |
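vector of numbers indicating the ratio $\Delta/\sigma$, where $\Delta$ is the difference
between the two population means and $\sigma$ is their common standard deviation
(see Details). The default value is delta.over.sigma=0. |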
pi.type |
character string indicating what kind of prediction interval to compute.
The possible values are pi.type="upper" (the default) and pi.type="lower". |
conf.level |
numeric vector of values between 0 and 1 indicating the confidence level of the
prediction interval. The default value is conf.level=0.95. |
Details
What is a Prediction Interval?
A prediction interval for some population is an interval on the real line
constructed so that it will contain $k$ future observations or averages
from that population with some specified probability $(1-\alpha)100\%$,
where $0 < \alpha < 1$ and $k$ is some pre-specified positive integer.
The quantity $(1-\alpha)100\%$ is called the confidence coefficient or
confidence level associated with the prediction interval. The function
predIntNorm computes a standard prediction interval based on a
sample from a normal distribution. The function predIntNormTestPower
computes the probability that at least one out of $k$ future observations or
averages will not be contained in the prediction interval,
where the population mean for the future observations is allowed to differ from
the population mean for the observations used to construct the prediction interval.
The Form of a Prediction Interval
Let $\underline{x} = x_1, x_2, \ldots, x_n$ denote a vector of $n$
observations from a normal distribution with parameters
mean=$\mu$ and sd=$\sigma$. Also, let $m$ denote the
sample size associated with the $k$ future averages (i.e., n.mean=$m$).
When $m=1$, each average is really just a single observation, so in the rest of
this help file the term “averages” will replace the phrase
“observations or averages”.
For a normal distribution, the form of a two-sided $(1-\alpha)100\%$ prediction
interval is:
$$[\bar{x} - Ks, \; \bar{x} + Ks] \;\;\;\; (1)$$
where $\bar{x}$ denotes the sample mean:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i \;\;\;\; (2)$$
$s$ denotes the sample standard deviation:
$$s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2 \;\;\;\; (3)$$
and $K$ denotes a constant that depends on the sample size $n$, the
confidence level, the number of future averages $k$, and the
sample size associated with the future averages, $m$. Do not confuse the
constant $K$ (uppercase K) with the number of future averages $k$
(lowercase k). The symbol $K$ is used here to be consistent with the
notation used for tolerance intervals (see tolIntNorm).
Similarly, the form of a one-sided lower prediction interval is:
$$[\bar{x} - Ks, \; \infty) \;\;\;\; (4)$$
and the form of a one-sided upper prediction interval is:
$$(-\infty, \; \bar{x} + Ks] \;\;\;\; (5)$$
The constant $K$ differs for one-sided versus two-sided prediction intervals.
The derivation of the constant $K$ is explained in the help file for
predIntNormK.
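For the special case of a single future average ($k = 1$) and a one-sided interval, the constant reduces to the standard closed form $K = t_{n-1,\,1-\alpha}\sqrt{1/m + 1/n}$. The following is a minimal base R sketch of that special case only, using made-up data; the variable names are illustrative, and predIntNormK remains the reference for the general case.

```r
# Hypothetical background sample (made-up values, for illustration only).
x <- c(10.2, 9.7, 11.4, 10.9, 9.9, 10.5)
n <- length(x)       # background sample size
m <- 1               # sample size of the future average (n.mean)
conf.level <- 0.95

# Closed form for K; valid only for k = 1 and a one-sided interval.
K <- qt(conf.level, df = n - 1) * sqrt(1 / m + 1 / n)

# One-sided upper prediction limit: xbar + K * s.
upper.limit <- mean(x) + K * sd(x)
print(c(K = K, upper.limit = upper.limit))
```

For a one-sided lower limit the same $K$ would be subtracted from the sample mean instead.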
Computing Power
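The "power" of the prediction interval is defined as the probability that at
least one out of the $k$ future observations or averages
will not be contained in the prediction interval, where the population mean
for the future observations is allowed to differ from the population mean for the
observations used to construct the prediction interval. The probability $p$
that all $k$ future observations will be contained in a one-sided upper
prediction interval (pi.type="upper") is given in Equation (6) of the help
file for predIntNormSimultaneousK, where $k=m$ and $r=1$ (in that help file,
$m$ denotes the number of future observations on each of $r$ future sampling
occasions, not the sample size associated with a future average):
$$p = \int_0^1 T(\sqrt{n}K; \; n-1, \; \sqrt{n}[\Phi^{-1}(v) + \frac{\Delta}{\sigma}]) \left[\frac{v^{k-1}}{B(k, 1)}\right] dv \;\;\;\; (6)$$
where $T(x; \nu, \delta)$ denotes the cdf of the
non-central Student's t-distribution with parameters
df=$\nu$ and ncp=$\delta$ evaluated at $x$;
$\Phi(x)$ denotes the cdf of the standard normal distribution
evaluated at $x$; and $B(\nu, \omega)$ denotes the value of the
beta function with parameters a=$\nu$ and b=$\omega$.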
The quantity $\Delta$ (upper case delta) denotes the difference between the
mean of the population that was sampled to construct the prediction interval, and
the mean of the population that will be sampled to produce the future observations.
The quantity $\sigma$ (sigma) denotes the population standard deviation of both
of these populations. Usually you assume $\Delta=0$ unless you are interested
in computing the power of the rule to detect a change in means between the
populations, as we are here.
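If we are interested in using averages instead of single observations, with
$w \ge 1$ (i.e., n.mean $\ge 1$, where $w$ denotes the sample size associated
with each future average), the first
term in the integral in Equation (6) that involves the cdf of the
non-central Student's t-distribution becomes:
$$T(\sqrt{n}K; \; n-1, \; \frac{\sqrt{n}}{\sqrt{w}}[\Phi^{-1}(v) + \frac{\sqrt{w}\Delta}{\sigma}]) \;\;\;\; (7)$$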
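The integral in Equation (6), with the adjustment for averages described above, can be evaluated numerically in base R. The sketch below is illustrative only: prob_all_inside() is a hypothetical helper (not an EnvStats function), its argument w stands for the sample size of each future average (n.mean), and the value of $K$ passed in is the closed form for $k = 1$, which is an assumption made here so that the result can be checked against the nominal confidence level.

```r
# Hypothetical helper: probability that all k future averages fall at or
# below the upper limit xbar + K * s, by numerical integration over v.
prob_all_inside <- function(K, n, k = 1, delta.over.sigma = 0, w = 1) {
  integrand <- function(v) {
    # Non-centrality parameter of the non-central t term.
    ncp <- sqrt(n / w) * (qnorm(v) + sqrt(w) * delta.over.sigma)
    # pt() can warn about reduced precision far in the tails; the
    # warnings are suppressed here because only a sketch is intended.
    t.term <- suppressWarnings(pt(sqrt(n) * K, df = n - 1, ncp = ncp))
    t.term * v^(k - 1) / beta(k, 1)
  }
  integrate(integrand, lower = 0, upper = 1,
            rel.tol = 1e-6, subdivisions = 1000L)$value
}

n <- 8
# Closed-form K for k = 1, single observations, 95% one-sided interval.
K <- qt(0.95, df = n - 1) * sqrt(1 + 1 / n)

p.null  <- prob_all_inside(K, n = n)                        # no shift in means
p.shift <- prob_all_inside(K, n = n, delta.over.sigma = 1)  # shifted future mean
print(c(p.null = p.null, p.shift = p.shift))
```

With no shift in means, the integral should return approximately the confidence level used to choose $K$; a positive shift lowers the probability that every future value stays below the limit.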
For a given confidence level $(1-\alpha)100\%$, the power of the rule to detect
a change in means is simply given by:
$$Power = 1 - p \;\;\;\; (8)$$
where $p$ is defined in Equation (6) above using the value of $K$ that
corresponds to $\Delta/\sigma = 0$. Thus, when the argument
delta.over.sigma=0, the value of $p$ is $1-\alpha$ and the power is
simply $\alpha 100\%$. As delta.over.sigma increases above 0, the power
increases.
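When $k = 1$ the power has a simple closed form, because the standardized difference between the future average and the sample mean follows a non-central t-distribution. The sketch below relies on that special case; power_k1_upper() is a hypothetical helper written for illustration, not part of EnvStats, and it does not cover $k > 1$.

```r
# Hypothetical helper: closed-form power for k = 1 and a one-sided
# upper prediction interval (not valid for k > 1).
power_k1_upper <- function(n, delta.over.sigma, n.mean = 1, conf.level = 0.95) {
  df <- n - 1
  scale <- sqrt(1 / n.mean + 1 / n)
  t.crit <- qt(conf.level, df = df)          # equals K / scale
  ncp <- delta.over.sigma / scale            # shift on the t scale
  1 - pt(t.crit, df = df, ncp = ncp)
}

power <- power_k1_upper(n = 4, delta.over.sigma = 0:2)
print(round(power, 7))
# Compare with the values 0.0500000 0.1743014 0.3990892 reported in the
# Examples section for predIntNormTestPower(n = 4, delta.over.sigma = 0:2).
```

At delta.over.sigma=0 the result equals one minus the confidence level, and it grows as the shift grows.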
When pi.type="lower", the same value of $K$ is used as when
pi.type="upper", but Equation (4) is used to construct the prediction
interval. Thus, the power increases as delta.over.sigma decreases below 0.
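The mirror-image relationship between the two interval types can be checked directly in the $k = 1$ case, where both powers have closed forms. This is a sketch under that assumption; the two helper functions are hypothetical and are not EnvStats functions.

```r
# Hypothetical helpers; closed forms valid for k = 1 and n.mean = 1 only.
power_upper_k1 <- function(n, delta.over.sigma, conf.level = 0.95) {
  t.crit <- qt(conf.level, df = n - 1)
  ncp <- delta.over.sigma / sqrt(1 + 1 / n)
  # P(future observation > xbar + K * s)
  pt(t.crit, df = n - 1, ncp = ncp, lower.tail = FALSE)
}
power_lower_k1 <- function(n, delta.over.sigma, conf.level = 0.95) {
  t.crit <- qt(conf.level, df = n - 1)
  ncp <- delta.over.sigma / sqrt(1 + 1 / n)
  # P(future observation < xbar - K * s)
  pt(-t.crit, df = n - 1, ncp = ncp)
}

shift <- c(0.5, 1, 2)
up  <- power_upper_k1(n = 4, delta.over.sigma = shift)
low <- power_lower_k1(n = 4, delta.over.sigma = -shift)
print(rbind(upper = up, lower = low))
```

A downward shift of a given size is detected by the lower interval with the same probability as an upward shift of that size is detected by the upper interval.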
Value
vector of values between 0 and 1 equal to the probability that at least one of
$k$ future observations or averages will fall outside the prediction interval.
Note
See the help files for predIntNorm and predIntNormSimultaneous.
In the course of designing a sampling program, an environmental scientist may wish
to determine the relationship between sample size, significance level, power, and
scaled difference if one of the objectives of the sampling program is to determine
whether two distributions differ from each other. The functions
predIntNormTestPower and plotPredIntNormTestPowerCurve can be
used to investigate these relationships for the case of normally-distributed
observations. In the case of a simple shift between the two means, the test based
on a prediction interval is not as powerful as the two-sample t-test. However, the
test based on a prediction interval is more efficient at detecting a shift in the
tail.
Author(s)
Steven P. Millard (EnvStats@ProbStatInfo.com)
References
See the help files for predIntNorm and predIntNormSimultaneous.
See Also
predIntNorm, predIntNormK, plotPredIntNormTestPowerCurve,
predIntNormSimultaneous, predIntNormSimultaneousK,
predIntNormSimultaneousTestPower, Prediction Intervals, Normal.
Examples
# Show how the power increases as delta.over.sigma increases.
# Assume a 95% upper prediction interval.
predIntNormTestPower(n = 4, delta.over.sigma = 0:2)
#[1] 0.0500000 0.1743014 0.3990892
#----------
# Look at how the power increases with sample size for a one-sided upper
# prediction interval with k=3, delta.over.sigma=2, and a confidence level
# of 95%.
predIntNormTestPower(n = c(4, 8), k = 3, delta.over.sigma = 2)
#[1] 0.3578250 0.5752113
#----------
# Show how the power for an upper 95% prediction limit increases as the
# number of future observations k increases. Here, we'll use n=20 and
# delta.over.sigma=1.
predIntNormTestPower(n = 20, k = 1:3, delta.over.sigma = 1)
#[1] 0.2408527 0.2751074 0.2936486