tolIntNormK {EnvStats} | R Documentation |
Compute the Value of K for a Tolerance Interval for a Normal Distribution
Description
Compute the value of K (the multiplier of estimated standard deviation) used to construct a tolerance interval based on data from a normal distribution.
Usage
tolIntNormK(n, df = n - 1, coverage = 0.95, cov.type = "content",
  ti.type = "two-sided", conf.level = 0.95, method = "exact",
  rel.tol = 1e-07, abs.tol = rel.tol)
Arguments
n |
a positive integer greater than 2 indicating the sample size upon which the tolerance interval is based. |
df |
the degrees of freedom associated with the tolerance interval. The default is df=n-1. |
coverage |
a scalar between 0 and 1 indicating the desired coverage of the tolerance interval. The default value is coverage=0.95. |
cov.type |
character string specifying the coverage type for the tolerance interval.
The possible values are |
ti.type |
character string indicating what kind of tolerance interval to compute.
The possible values are |
conf.level |
a scalar between 0 and 1 indicating the confidence level associated with the tolerance interval. The default value is conf.level=0.95. |
method |
for the case of a two-sided tolerance interval, a character string specifying the method for
constructing the tolerance interval. This argument is ignored if |
rel.tol |
in the case when |
abs.tol |
in the case when |
Details
A tolerance interval for some population is an interval on the real line constructed so as to contain 100 \beta \% of the population (i.e., 100 \beta \% of all future observations), where 0 < \beta < 1. The quantity 100 \beta \% is called the coverage.
There are two kinds of tolerance intervals (Guttman, 1970):
A \beta-content tolerance interval with confidence level 100(1-\alpha)\% is constructed so that it contains at least 100 \beta \% of the population (i.e., the coverage is at least 100 \beta \%) with probability 100(1-\alpha)\%, where 0 < \alpha < 1. The quantity 100(1-\alpha)\% is called the confidence level or confidence coefficient associated with the tolerance interval.
A \beta-expectation tolerance interval is constructed so that the average coverage of the interval is 100 \beta \%.
Note: A \beta-expectation tolerance interval with coverage 100 \beta \% is equivalent to a prediction interval for one future observation with associated confidence level 100 \beta \%. Note that there is no explicit confidence level associated with a \beta-expectation tolerance interval. If a \beta-expectation tolerance interval is treated as a \beta-content tolerance interval, the confidence level associated with this tolerance interval is usually around 50% (e.g., Guttman, 1970, Table 4.2, p.76).
For a normal distribution, the form of a two-sided 100(1-\alpha)\% tolerance interval is:
[\bar{x} - Ks, \, \bar{x} + Ks]
where \bar{x} denotes the sample mean, s denotes the sample standard deviation, and K denotes a constant that depends on the sample size n, the coverage, and, for a \beta-content tolerance interval (but not a \beta-expectation tolerance interval), the confidence level.
Similarly, the form of a one-sided lower tolerance interval is:
[\bar{x} - Ks, \, \infty]
and the form of a one-sided upper tolerance interval is:
[-\infty, \, \bar{x} + Ks]
but K differs for one-sided versus two-sided tolerance intervals.
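The three interval forms above can be sketched in base R as follows. This is an illustration only: the helper name tol_limits and the simulated data are assumptions made for this sketch and are not part of EnvStats. The multiplier K is supplied by the caller and must match the interval type; the two values used here are the ones reported in the Examples section for n = 20.

```r
# Hypothetical helper (not an EnvStats function): turn a multiplier K
# into tolerance limits of the form xbar -/+ K * s.
tol_limits <- function(x, K, ti.type = c("two-sided", "lower", "upper")) {
  ti.type <- match.arg(ti.type)
  xbar <- mean(x)
  s <- sd(x)
  switch(ti.type,
    "two-sided" = c(lower = xbar - K * s, upper = xbar + K * s),
    "lower"     = c(lower = xbar - K * s, upper = Inf),
    "upper"     = c(lower = -Inf,         upper = xbar + K * s))
}

# Illustrative data only: 20 simulated normal observations.
set.seed(1)
x <- rnorm(20, mean = 10, sd = 2)

# Two-sided, 95% coverage, 95% confidence (K from the Examples section).
tol_limits(x, K = 2.760346)

# One-sided upper, 99% coverage, 90% confidence (K from the Examples section).
tol_limits(x, K = 3.051543, ti.type = "upper")
```

In practice the function tolIntNorm listed under See Also computes the limits directly from the data.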
The Derivation of K for a \beta-Content Tolerance Interval
One-Sided Case
When ti.type="upper" or ti.type="lower", the constant K for a 100 \beta \% \beta-content tolerance interval with associated confidence level 100(1 - \alpha)\% is given by:
K = t(n-1, 1 - \alpha, z_\beta \sqrt{n}) / \sqrt{n}
where t(\nu, p, \delta) denotes the p'th quantile of a non-central t-distribution with \nu degrees of freedom and noncentrality parameter \delta (see the help file for TDist), and z_p denotes the p'th quantile of a standard normal distribution.
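The one-sided formula maps directly onto the base R quantile functions qt (with its ncp argument) and qnorm. The sketch below is a minimal illustration, assuming df = n - 1 as in the formula; the function name k_one_sided is hypothetical and this is not the EnvStats source code.

```r
# Hypothetical helper: one-sided beta-content multiplier
#   K = t(n-1, 1-alpha, z_beta * sqrt(n)) / sqrt(n)
k_one_sided <- function(n, coverage = 0.95, conf.level = 0.95) {
  delta <- qnorm(coverage) * sqrt(n)   # noncentrality parameter z_beta * sqrt(n)
  qt(conf.level, df = n - 1, ncp = delta) / sqrt(n)
}

# Same settings as the one-sided calls in the Examples section,
# which report 3.051543 and 3.187294 respectively.
k_one_sided(n = 20, coverage = 0.99, conf.level = 0.9)
k_one_sided(n = 8)
```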
Two-Sided Case
When ti.type="two-sided" and method="exact", the exact formula for the constant K for a 100 \beta \% \beta-content tolerance interval with associated confidence level 100(1-\alpha)\% requires numerical integration and has been derived by several different authors, including Odeh (1978), Eberhardt et al. (1989), Jilek (1988), Fujino (1989), and Janiga and Miklos (2001). Specifically, for given values of the sample size n, degrees of freedom \nu, confidence level (1-\alpha), and coverage \beta, the constant K is the solution to the equation:
\sqrt{\frac{n}{2 \pi}} \, \int^\infty_{-\infty} {F(x, K, \nu, R) \, e^{(-nx^2)/2}} \, dx = 1 - \alpha
where F(x, K, \nu, R) denotes the upper-tail area from (\nu \, R^2) / K^2 to \infty of the chi-squared distribution with \nu degrees of freedom, and R is the solution to the equation:
\Phi (x + R) - \Phi (x - R) = \beta
where \Phi() denotes the standard normal cumulative distribution function.
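The integral equation can be solved numerically with nested root finding and quadrature. The sketch below is one possible way to do this in base R, not the algorithm EnvStats uses: the function name k_two_sided_exact, the finite integration limits of 8 standard errors, the search bracket for K, and the tolerances are all assumptions of this illustration. It uses the fact that \sqrt{n/(2\pi)} e^{-nx^2/2} is the normal density with mean 0 and standard deviation 1/\sqrt{n}.

```r
# Hypothetical helper: solve the integral equation for K numerically.
k_two_sided_exact <- function(n, df = n - 1, coverage = 0.95, conf.level = 0.95) {
  # R(x): solution of Phi(x + R) - Phi(x - R) = coverage
  R_of_x <- function(x) {
    uniroot(function(r) pnorm(x + r) - pnorm(x - r) - coverage,
            lower = 0, upper = abs(x) + 10, tol = 1e-12)$root
  }
  # Left-hand side of the integral equation for a trial value of K
  conf_of_K <- function(K) {
    integrand <- function(x) {
      R <- vapply(x, R_of_x, numeric(1))
      # F(x, K, nu, R): upper-tail chi-squared area beyond nu * R^2 / K^2
      pchisq(df * R^2 / K^2, df = df, lower.tail = FALSE) *
        dnorm(x, mean = 0, sd = 1 / sqrt(n))
    }
    half <- 8 / sqrt(n)   # assumed cutoff; the weight outside is negligible
    integrate(integrand, lower = -half, upper = half, rel.tol = 1e-7)$value
  }
  # Find the K at which the left-hand side equals 1 - alpha
  uniroot(function(K) conf_of_K(K) - conf.level,
          lower = 0.5, upper = 50, tol = 1e-8)$root
}

# The Examples section reports 2.760346 for the exact method with n = 20;
# this sketch should agree to within its numerical tolerance.
k_two_sided_exact(n = 20)
```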
When ti.type="two-sided" and method="wald.wolfowitz", the approximate formula due to Wald and Wolfowitz (1946) for the constant K for a 100 \beta \% \beta-content tolerance interval with associated confidence level 100(1-\alpha)\% is given by:
K \approx r \, u
where r is the solution to the equation:
\Phi (\frac{1}{\sqrt{n}} + r) - \Phi (\frac{1}{\sqrt{n}} - r) = \beta
\Phi() denotes the standard normal cumulative distribution function, and u is given by:
u = \sqrt{\frac{n-1}{\chi^{2} (n-1, \alpha)}}
where \chi^{2} (\nu, p) denotes the p'th quantile of the chi-squared distribution with \nu degrees of freedom.
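The Wald-Wolfowitz approximation needs only a one-dimensional root search and a chi-squared quantile. A minimal base R sketch follows; the function name k_wald_wolfowitz and the root-search bracket are assumptions of this illustration, and df = n - 1 is used as in the formulas above.

```r
# Hypothetical helper: Wald-Wolfowitz approximation K ~ r * u.
k_wald_wolfowitz <- function(n, coverage = 0.95, conf.level = 0.95) {
  alpha <- 1 - conf.level
  d <- 1 / sqrt(n)
  # r solves Phi(1/sqrt(n) + r) - Phi(1/sqrt(n) - r) = beta
  r <- uniroot(function(r) pnorm(d + r) - pnorm(d - r) - coverage,
               lower = 0, upper = d + 10, tol = 1e-10)$root
  # u = sqrt((n - 1) / chi^2(n - 1, alpha))
  u <- sqrt((n - 1) / qchisq(alpha, df = n - 1))
  r * u
}

# The Examples section reports 2.751789 for method = "wald" with n = 20.
k_wald_wolfowitz(n = 20)
```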
The Derivation of K for a \beta-Expectation Tolerance Interval
As stated above, a \beta-expectation tolerance interval with coverage 100 \beta \% is equivalent to a prediction interval for one future observation with associated confidence level 100 \beta \%. This is because the probability that any single future observation will fall into this interval is 100 \beta \%, so the number of observations, out of N future observations, that fall into this interval has a binomial distribution with parameters size = N and prob = \beta (see the help file for Binomial). Hence the expected proportion of future observations that will fall into this interval is 100 \beta \% and is independent of the value of N. See the help file for predIntNormK for information on how to derive K for these intervals.
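As a rough illustration of the average-coverage property, the sketch below uses the standard Student's t result for a two-sided prediction interval for one future observation, K = t(n-1, (1+\beta)/2) \sqrt{1 + 1/n} with a central t quantile. That formula is stated here as an assumption of the sketch rather than taken from this help file; the help file for predIntNormK remains the authoritative reference. The function name k_expectation_two_sided, the seed, and the number of replicates are likewise illustrative choices.

```r
# Hypothetical helper: two-sided multiplier for a prediction interval
# for one future observation (assumed standard Student's t result).
k_expectation_two_sided <- function(n, coverage = 0.95) {
  qt((1 + coverage) / 2, df = n - 1) * sqrt(1 + 1 / n)
}

# Simulation check: the true coverage of xbar -/+ K * s varies from
# sample to sample, but its average should be close to beta.
set.seed(42)
n <- 20
K <- k_expectation_two_sided(n, coverage = 0.95)
cover <- replicate(5000, {
  x <- rnorm(n)   # standard normal population, so pnorm gives true coverage
  pnorm(mean(x) + K * sd(x)) - pnorm(mean(x) - K * sd(x))
})
mean(cover)   # average coverage across simulated samples
```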
Value
The value of K, a numeric scalar used to construct tolerance intervals for a normal (Gaussian) distribution.
Note
Tabled values of K are given in Gibbons et al. (2009), Gilbert (1987), Guttman (1970), Krishnamoorthy and Mathew (2009), Owen (1962), Odeh and Owen (1980), and USEPA (2009).
Tolerance intervals have long been applied to quality control and life testing problems (Hahn, 1970b,c; Hahn and Meeker, 1991; Krishnamoorthy and Mathew, 2009). References that discuss tolerance intervals in the context of environmental monitoring include: Berthouex and Brown (2002, Chapter 21), Gibbons et al. (2009), Millard and Neerchal (2001, Chapter 6), Singh et al. (2010b), and USEPA (2009).
Author(s)
Steven P. Millard (EnvStats@ProbStatInfo.com)
References
Berthouex, P.M., and L.C. Brown. (2002). Statistics for Environmental Engineers. Lewis Publishers, Boca Raton.
Draper, N., and H. Smith. (1998). Applied Regression Analysis. Third Edition. John Wiley and Sons, New York.
Eberhardt, K.R., R.W. Mee, and C.P. Reeve. (1989). Computing Factors for Exact Two-Sided Tolerance Limits for a Normal Distribution. Communications in Statistics, Part B-Simulation and Computation 18, 397-413.
Ellison, B.E. (1964). On Two-Sided Tolerance Intervals for a Normal Distribution. Annals of Mathematical Statistics 35, 762-772.
Fujino, T. (1989). Exact Two-Sided Tolerance Limits for a Normal Distribution. Japanese Journal of Applied Statistics 18, 29-36.
Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.
Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York.
Guttman, I. (1970). Statistical Tolerance Regions: Classical and Bayesian. Hafner Publishing Co., Darien, CT.
Hahn, G.J. (1970b). Statistical Intervals for a Normal Population, Part I: Tables, Examples and Applications. Journal of Quality Technology 2(3), 115-125.
Hahn, G.J. (1970c). Statistical Intervals for a Normal Population, Part II: Formulas, Assumptions, Some Derivations. Journal of Quality Technology 2(4), 195-206.
Hahn, G.J., and W.Q. Meeker. (1991). Statistical Intervals: A Guide for Practitioners. John Wiley and Sons, New York.
Jilek, M. (1988). Statisticke Tolerancni Meze. SNTL, Praha.
Krishnamoorthy, K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley and Sons, Hoboken.
Janiga, I., and R. Miklos. (2001). Statistical Tolerance Intervals for a Normal Distribution. Measurement Science Review 11, 29-32.
Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton.
Odeh, R.E. (1978). Tables of Two-Sided Tolerance Factors for a Normal Distribution. Communications in Statistics, Part B-Simulation and Computation 7, 183-201.
Odeh, R.E., and D.B. Owen. (1980). Tables for Normal Tolerance Limits, Sampling Plans, and Screening. Marcel Dekker, New York.
Owen, D.B. (1962). Handbook of Statistical Tables. Addison-Wesley, Reading, MA.
Singh, A., R. Maichle, and N. Armbya. (2010a). ProUCL Version 4.1.00 User Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.
Singh, A., N. Armbya, and A. Singh. (2010b). ProUCL Version 4.1.00 Technical Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2010). Errata Sheet - March 2009 Unified Guidance. EPA 530/R-09-007a, August 9, 2010. Office of Resource Conservation and Recovery, Program Information and Implementation Division. U.S. Environmental Protection Agency, Washington, D.C.
Wald, A., and J. Wolfowitz. (1946). Tolerance Limits for a Normal Distribution. Annals of Mathematical Statistics 17, 208-215.
See Also
tolIntNorm, predIntNorm, Normal, estimate.object, enorm, eqnorm, Tolerance Intervals, Prediction Intervals, Estimating Distribution Parameters, Estimating Distribution Quantiles.
Examples
# Compute the value of K for a two-sided 95% beta-content
# tolerance interval with associated confidence level 95%
# given a sample size of n=20.
#----------
# Exact method
tolIntNormK(n = 20)
#[1] 2.760346
#----------
# Approximate method due to Wald and Wolfowitz (1946)
tolIntNormK(n = 20, method = "wald")
# [1] 2.751789
#--------------------------------------------------------------------
# Compute the value of K for a one-sided upper tolerance limit
# with 99% coverage and associated confidence level 90%
# given a sample size of n=20.
tolIntNormK(n = 20, ti.type = "upper", coverage = 0.99,
  conf.level = 0.9)
#[1] 3.051543
#--------------------------------------------------------------------
# Example 17-3 of USEPA (2009, p. 17-17) shows how to construct a
# beta-content upper tolerance limit with 95% coverage and 95%
# confidence using chrysene data and assuming a lognormal
# distribution. The sample size is n = 8 observations from
# the two compliance wells. Here we will compute the
# multiplier for the log-transformed data.
tolIntNormK(n = 8, ti.type = "upper")
#[1] 3.187294