predIntNormTestPower {EnvStats} | R Documentation |

Compute the probability that at least one out of `k` future observations
(or means) falls outside a prediction interval for `k` future observations
(or means) for a normal distribution.

```
predIntNormTestPower(n, df = n - 1, n.mean = 1, k = 1, delta.over.sigma = 0,
pi.type = "upper", conf.level = 0.95)
```

`n` |
vector of positive integers greater than 2 indicating the sample size upon which the prediction interval is based. |

`df` |
vector of positive integers indicating the degrees of freedom associated with
the sample size. The default value is `df=n-1`. |

`n.mean` |
positive integer specifying the sample size associated with the future averages.
The default value is `n.mean=1`. |

`k` |
vector of positive integers specifying the number of future observations that the
prediction interval should contain with confidence level `conf.level`.
The default value is `k=1`. |

`delta.over.sigma` |
vector of numbers indicating the ratio `\Delta/\sigma`. The default value is
`delta.over.sigma=0`. |

`pi.type` |
character string indicating what kind of prediction interval to compute.
The possible values are `"upper"` (the default) and `"lower"`. |

`conf.level` |
numeric vector of values between 0 and 1 indicating the confidence level of the
prediction interval. The default value is `conf.level=0.95`. |

*What is a Prediction Interval?*

A prediction interval for some population is an interval on the real line
constructed so that it will contain `k` future observations or averages
from that population with some specified probability `(1-\alpha)100\%`,
where `0 < \alpha < 1` and `k` is some pre-specified positive integer.
The quantity `(1-\alpha)100\%` is called the confidence coefficient or
confidence level associated with the prediction interval. The function
`predIntNorm` computes a standard prediction interval based on a
sample from a normal distribution. The function `predIntNormTestPower`
computes the probability that at least one out of `k` future observations or
averages will **not** be contained in the prediction interval,
where the population mean for the future observations is allowed to differ from
the population mean for the observations used to construct the prediction interval.

*The Form of a Prediction Interval*

Let `\underline{x} = x_1, x_2, \ldots, x_n` denote a vector of `n`
observations from a normal distribution with parameters
`mean=\mu` and `sd=\sigma`. Also, let `m` denote the
sample size associated with the `k` future averages (i.e., `n.mean=m`).
When `m=1`, each average is really just a single observation, so in the rest of
this help file the term “averages” will replace the phrase
“observations or averages”.

For a normal distribution, the form of a two-sided `(1-\alpha)100\%` prediction
interval is:

`[\bar{x} - Ks, \bar{x} + Ks] \;\;\;\;\;\; (1)`

where `\bar{x}` denotes the sample mean:

`\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i \;\;\;\;\;\; (2)`

`s` denotes the sample standard deviation:

`s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2 \;\;\;\;\;\; (3)`

and `K` denotes a constant that depends on the sample size `n`, the
confidence level, the number of future averages `k`, and the
sample size associated with the future averages, `m`. Do not confuse the
constant `K` (uppercase K) with the number of future averages `k`
(lowercase k). The symbol `K` is used here to be consistent with the
notation used for tolerance intervals (see `tolIntNorm`).

Similarly, the form of a one-sided lower prediction interval is:

`[\bar{x} - Ks, \infty) \;\;\;\;\;\; (4)`

and the form of a one-sided upper prediction interval is:

`(-\infty, \bar{x} + Ks] \;\;\;\;\;\; (5)`

but `K` differs for one-sided versus two-sided prediction intervals.
The derivation of the constant `K` is explained in the help file for
`predIntNormK`.
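
As a rough illustration of Equations (2), (3) and (5), the sketch below computes a
one-sided upper prediction limit in base R for the simplest case of a single future
average (`k=1`). It assumes the standard closed form
`K = t_{n-1, 1-\alpha} \sqrt{1/m + 1/n}` for that case, which is not derived in this
help file. The helper name `upper_pred_limit` and the data are made up for
illustration, and `predIntNormK` remains the reference for how `K` is actually computed.

```r
# Hypothetical helper (not part of EnvStats): upper prediction limit for ONE
# future average of m observations, assuming
#   K = t(1 - alpha; n - 1) * sqrt(1/m + 1/n).
upper_pred_limit <- function(x, m = 1, conf.level = 0.95) {
  n <- length(x)
  K <- qt(conf.level, df = n - 1) * sqrt(1 / m + 1 / n)
  list(K = K, limit = mean(x) + K * sd(x))   # x-bar + K * s, as in Equation (5)
}

x <- c(9.8, 10.4, 10.1, 9.6)   # made-up sample of size n = 4
res <- upper_pred_limit(x)
res$K       # constant K for n = 4, m = 1, k = 1
res$limit   # upper end of the one-sided upper prediction interval
```

Under this assumption, increasing `m` shrinks `K`, because an average of several
future observations varies less than a single one.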

*Computing Power*

The "power" of the prediction interval is defined as the probability that at
least one out of the `k` future observations or averages
will **not** be contained in the prediction interval, where the population mean
for the future observations is allowed to differ from the population mean for the
observations used to construct the prediction interval. The probability `p`
that all `k` future observations will be contained in a one-sided upper
prediction interval (`pi.type="upper"`) is given in Equation (6) of the help
file for `predIntNormSimultaneousK`, where, in the notation of that help file,
`k=m` and `r=1`:

`p = \int_0^1 T(\sqrt{n}K; n-1, \sqrt{n}[\Phi^{-1}(v) + \frac{\Delta}{\sigma}]) [\frac{v^{k-1}}{B(k, 1)}] dv \;\;\;\;\;\; (6)`

where `T(x; \nu, \delta)` denotes the cdf of the
non-central Student's t-distribution with parameters
`df=\nu` and `ncp=\delta` evaluated at `x`;
`\Phi(x)` denotes the cdf of the standard normal distribution
evaluated at `x`; and `B(\nu, \omega)` denotes the value of the
beta function with parameters `a=\nu` and `b=\omega`.

The quantity `\Delta` (upper case delta) denotes the difference between the
mean of the population that will be sampled to produce the future observations and
the mean of the population that was sampled to construct the prediction interval
(future mean minus sampled mean, so that `\Delta > 0` corresponds to an upward shift).
The quantity `\sigma` (sigma) denotes the population standard deviation of both
of these populations. Usually you assume `\Delta=0` unless you are interested
in computing the power of the rule to detect a change in means between the
populations, as we are here.
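
Equation (6) can be evaluated by one-dimensional numerical integration. The
following base-R sketch is one way to do that; it is not the code used inside
`predIntNormTestPower`. The helper name `p_all_contained` is hypothetical, and the
value of `K` is the closed form for a single future observation
(`K = t_{n-1, 1-\alpha} \sqrt{1 + 1/n}`), which is an assumption brought in from
outside this help file.

```r
# Hypothetical helper: probability that all k future observations fall below
# x-bar + K*s, obtained by integrating Equation (6) numerically.
p_all_contained <- function(n, K, k = 1, delta.over.sigma = 0) {
  integrand <- function(v) {
    ncp <- sqrt(n) * (qnorm(v) + delta.over.sigma)
    # v^(k-1) / B(k, 1) is the Beta(k, 1) density, available as dbeta(v, k, 1)
    pt(sqrt(n) * K, df = n - 1, ncp = ncp) * dbeta(v, k, 1)
  }
  integrate(integrand, lower = 0, upper = 1)$value
}

n <- 4
K <- qt(0.95, df = n - 1) * sqrt(1 + 1 / n)        # assumed K for k = 1, n.mean = 1
p0 <- p_all_contained(n, K)                        # no shift: close to 0.95
p1 <- p_all_contained(n, K, delta.over.sigma = 1)  # upward shift: smaller than p0
c(p0 = p0, p1 = p1)
```

With no shift the integral returns approximately the confidence level, and an
upward shift in the future mean lowers the containment probability.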

If we are interested in using averages instead of single observations, with each
average based on `w \ge 1` observations (i.e., `n.mean=w`; this is the quantity
denoted `m` above), the first
term in the integral in Equation (6) that involves the cdf of the
non-central Student's t-distribution becomes:

`T(\sqrt{n}K; n-1, \frac{\sqrt{n}}{\sqrt{w}}[\Phi^{-1}(v) + \frac{\sqrt{w}\Delta}{\sigma}]) \;\;\;\;\;\; (7)`
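
One way to see where the non-centrality parameter in Equation (7) comes from is
sketched below. The sketch assumes (this is not spelled out in this help file) that
each future average `\bar{Y}_i` is normal with mean `\mu + \Delta` and standard
deviation `\sigma/\sqrt{w}`, independent of the original sample, and it conditions on
the largest of the `k` future averages sitting at its `v`-th quantile:

```latex
\begin{aligned}
\max_i \bar{Y}_i &= \mu + \Delta + \frac{\sigma}{\sqrt{w}}\, z_v,
  \qquad z_v = \Phi^{-1}(v) \\[4pt]
\max_i \bar{Y}_i \le \bar{x} + Ks
  &\iff
  \frac{\sqrt{n}\,\bigl(\mu + \Delta + \sigma z_v/\sqrt{w} - \bar{x}\bigr)/\sigma}{s/\sigma}
  \le \sqrt{n}\,K \\[4pt]
\frac{\sqrt{n}\,\bigl(\mu + \Delta + \sigma z_v/\sqrt{w} - \bar{x}\bigr)}{\sigma}
  &\sim N\!\left(\frac{\sqrt{n}}{\sqrt{w}}
     \left[z_v + \frac{\sqrt{w}\,\Delta}{\sigma}\right],\; 1\right)
\end{aligned}
```

Dividing this normal variate by the independent quantity `s/\sigma` gives a
non-central t variate with `n-1` degrees of freedom and the non-centrality parameter
shown in Equation (7). Setting `w=1` recovers the corresponding term in Equation (6).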

For a given confidence level `(1-\alpha)100\%`, the power of the rule to detect
a change in means is simply given by:

`Power = 1 - p \;\;\;\;\;\; (8)`

where `p` is defined in Equation (6) above using the value of `K` that
corresponds to `\Delta/\sigma = 0`. Thus, when the argument
`delta.over.sigma=0`, the value of `p` is `1-\alpha` and the power is
simply `\alpha 100\%`. As `delta.over.sigma` increases above 0, the power
increases.
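
For a single future observation (`k=1`, `n.mean=1`) the integral is not needed,
because `(Y - \bar{x}) / (s \sqrt{1 + 1/n})` follows a non-central t-distribution
with `n-1` degrees of freedom and non-centrality parameter
`(\Delta/\sigma) / \sqrt{1 + 1/n}`. The base-R sketch below uses that fact to
illustrate Equation (8). The helper name `power_upper_k1` is hypothetical, and the
closed-form `K` it relies on is an assumption not derived in this help file.

```r
# Hypothetical helper: power of a one-sided upper prediction interval for a
# single future observation (k = 1, n.mean = 1), assuming the closed-form K.
power_upper_k1 <- function(n, delta.over.sigma = 0, conf.level = 0.95) {
  scale <- sqrt(1 + 1 / n)
  # P(Y <= x-bar + K*s) with K = qt(conf.level, n - 1) * scale
  p <- pt(qt(conf.level, df = n - 1), df = n - 1,
          ncp = delta.over.sigma / scale)
  1 - p   # Equation (8)
}

pw <- power_upper_k1(n = 4, delta.over.sigma = 0:2)
pw[1]          # essentially alpha = 0.05, because there is no shift
diff(pw) > 0   # power rises as the shift grows
```

For `n=4` these values can be checked against the first example at the end of this
help file.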

When `pi.type="lower"`, the same value of `K` is used as when
`pi.type="upper"`, but Equation (4) is used to construct the prediction
interval. Thus, the power increases as `delta.over.sigma` decreases below 0.
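
The symmetry between the two interval types can be illustrated for a single future
observation. The base-R sketch below is only a sketch: it assumes the closed-form
`K = t_{n-1, 1-\alpha} \sqrt{1 + 1/n}` for `k=1`, and the helper name `power_k1` is
hypothetical. It shows that the power of the lower interval at a shift of `-d`
matches the power of the upper interval at a shift of `+d`.

```r
# Hypothetical helper: power for k = 1, n.mean = 1 under either interval type.
power_k1 <- function(n, delta.over.sigma, pi.type = c("upper", "lower"),
                     conf.level = 0.95) {
  pi.type <- match.arg(pi.type)
  t.crit <- qt(conf.level, df = n - 1)
  ncp <- delta.over.sigma / sqrt(1 + 1 / n)
  if (pi.type == "upper") {
    1 - pt(t.crit, df = n - 1, ncp = ncp)   # future value above x-bar + K*s
  } else {
    pt(-t.crit, df = n - 1, ncp = ncp)      # future value below x-bar - K*s
  }
}

shifts <- c(0.5, 1, 2)
lower <- power_k1(10, -shifts, "lower")   # downward shifts, lower interval
upper <- power_k1(10,  shifts, "upper")   # upward shifts, upper interval
all.equal(lower, upper)
```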

vector of values between 0 and 1 equal to the probability that at least one of
`k` future observations or averages will fall outside the prediction interval.

See the help files for `predIntNorm` and `predIntNormSimultaneous`.

In the course of designing a sampling program, an environmental scientist may wish
to determine the relationship between sample size, significance level, power, and
scaled difference if one of the objectives of the sampling program is to determine
whether two distributions differ from each other. The functions
`predIntNormTestPower` and `plotPredIntNormTestPowerCurve` can be
used to investigate these relationships for the case of normally-distributed
observations. In the case of a simple shift between the two means, the test based
on a prediction interval is not as powerful as the two-sample t-test. However, the
test based on a prediction interval is more efficient at detecting a shift in the
tail.

Steven P. Millard (EnvStats@ProbStatInfo.com)

`predIntNorm`, `predIntNormK`, `plotPredIntNormTestPowerCurve`,
`predIntNormSimultaneous`, `predIntNormSimultaneousK`,
`predIntNormSimultaneousTestPower`, Prediction Intervals, Normal.

```
library(EnvStats)  # provides predIntNormTestPower

# Show how the power increases as delta.over.sigma increases.
# Assume a 95% upper prediction interval.
predIntNormTestPower(n = 4, delta.over.sigma = 0:2)
#[1] 0.0500000 0.1743014 0.3990892
#----------
# Look at how the power increases with sample size for a one-sided upper
# prediction interval with k=3, delta.over.sigma=2, and a confidence level
# of 95%.
predIntNormTestPower(n = c(4, 8), k = 3, delta.over.sigma = 2)
#[1] 0.3578250 0.5752113
#----------
# Show how the power for an upper 95% prediction limit increases as the
# number of future observations k increases. Here, we'll use n=20 and
# delta.over.sigma=1.
predIntNormTestPower(n = 20, k = 1:3, delta.over.sigma = 1)
#[1] 0.2408527 0.2751074 0.2936486
```

[Package *EnvStats* version 2.8.1 Index]