tolIntNpar {EnvStats}  R Documentation 
Construct a \beta-content or \beta-expectation tolerance interval nonparametrically, without making any assumptions about the form of the distribution except that it is continuous.
tolIntNpar(x, coverage, conf.level, cov.type = "content",
    ltl.rank = ifelse(ti.type == "upper", 0, 1),
    n.plus.one.minus.utl.rank = ifelse(ti.type == "lower", 0, 1),
    lb = -Inf, ub = Inf, ti.type = "two-sided")
x 
numeric vector of observations. Missing ( 
coverage 
a scalar between 0 and 1 indicating the desired coverage of the tolerance interval.
conf.level 
a scalar between 0 and 1 indicating the confidence level associated with the tolerance interval.
cov.type 
character string specifying the coverage type for the tolerance interval.
The possible values are "content" (the default) and "expectation".
ltl.rank 
positive integer indicating the rank of the order statistic to use for the lower bound
of the tolerance interval. If 
n.plus.one.minus.utl.rank 
positive integer related to the rank of the order statistic to use for
the upper bound of the tolerance interval. A value of

lb , ub 
scalars indicating lower and upper bounds on the distribution. By default, lb = -Inf and ub = Inf.
ti.type 
character string indicating what kind of tolerance interval to compute.
The possible values are "two-sided" (the default), "lower", and "upper".
A tolerance interval for some population is an interval on the real line constructed so as to contain 100\beta\% of the population (i.e., 100\beta\% of all future observations), where 0 < \beta < 1. The quantity 100\beta\% is called the coverage.
There are two kinds of tolerance intervals (Guttman, 1970):
A \beta-content tolerance interval with confidence level 100(1-\alpha)\% is constructed so that it contains at least 100\beta\% of the population (i.e., the coverage is at least 100\beta\%) with probability 100(1-\alpha)\%, where 0 < \alpha < 1. The quantity 100(1-\alpha)\% is called the confidence level or confidence coefficient associated with the tolerance interval.
A \beta-expectation tolerance interval is constructed so that the average coverage of the interval is 100\beta\%.
Note: A \beta-expectation tolerance interval with coverage 100\beta\% is equivalent to a prediction interval for one future observation with associated confidence level 100\beta\%. Note that there is no explicit confidence level associated with a \beta-expectation tolerance interval. If a \beta-expectation tolerance interval is treated as a \beta-content tolerance interval, the confidence level associated with this tolerance interval is usually around 50% (e.g., Guttman, 1970, Table 4.2, p.76).
The Form of a Nonparametric Tolerance Interval
Let \underline{x} denote a random sample of n independent observations from some continuous distribution and let x_{(i)} denote the i'th order statistic in \underline{x}. A two-sided nonparametric tolerance interval is constructed as:
[x_{(u)}, x_{(v)}] \;\;\;\;\;\; (1)
where u and v are positive integers between 1 and n, and u < v. That is, u denotes the rank of the lower tolerance limit, and v denotes the rank of the upper tolerance limit. To make it easier to write some equations later on, we can also write the tolerance interval (1) in a slightly different way as:
[x_{(u)}, x_{(n+1-w)}] \;\;\;\;\;\; (2)
where
w = n + 1 - v \;\;\;\;\;\; (3)
so that w is a positive integer between 1 and n-1, and u < n+1-w.
In terms of the arguments to the function tolIntNpar, the argument ltl.rank corresponds to u, and the argument n.plus.one.minus.utl.rank corresponds to w.
If we allow u=0 and w=0 and define lower and upper bounds as:
x_{(0)} = lb \;\;\;\;\;\; (4)
x_{(n+1)} = ub \;\;\;\;\;\; (5)
then equation (2) above can also represent a one-sided lower or one-sided upper tolerance interval. That is, a one-sided lower nonparametric tolerance interval is constructed as:
[x_{(u)}, x_{(n+1)}] = [x_{(u)}, ub] \;\;\;\;\;\; (6)
and a one-sided upper nonparametric tolerance interval is constructed as:
[x_{(0)}, x_{(v)}] = [lb, x_{(v)}] \;\;\;\;\;\; (7)
Usually, lb = -\infty or lb = 0, and ub = \infty.
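Equations (2) and (4)-(7) can be sketched in a few lines of base R. The helper name npar_tol_limits below is hypothetical and is not part of EnvStats; it only illustrates how the ranks u and w and the bounds lb and ub select the two limits, under the assumption that non-finite values are simply dropped.

```r
# Hypothetical helper (not an EnvStats function): pick the tolerance limits
# x_(u) and x_(n+1-w) from the order statistics, with x_(0) = lb and
# x_(n+1) = ub as in equations (4) and (5).
npar_tol_limits <- function(x, u = 1, w = 1, lb = -Inf, ub = Inf) {
  x <- sort(x[is.finite(x)])   # order statistics x_(1), ..., x_(n)
  n <- length(x)
  stopifnot(u >= 0, w >= 0, u < n + 1 - w)
  # Pad with the bounds so that x_(i) sits at position i + 1
  padded <- c(lb, x, ub)
  c(LTL = padded[u + 1], UTL = padded[n + 1 - w + 1])
}

x <- c(5.0, 7.5, 9.2, 5.4, 6.7)
npar_tol_limits(x)                  # two-sided: [x_(1), x_(n)]
npar_tol_limits(x, u = 0, lb = 0)   # one-sided upper: [lb, x_(n)]
npar_tol_limits(x, w = 0)           # one-sided lower: [x_(1), ub]
```

Padding the sorted sample with lb and ub keeps the indexing identical for the two-sided and one-sided cases, which mirrors how equation (2) covers all three forms.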
Let C be a random variable denoting the coverage of the above nonparametric tolerance intervals. Wilks (1941) showed that the distribution of C follows a beta distribution with parameters shape1=v-u and shape2=w+u when the unknown distribution is continuous.
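The distribution-free result can be checked informally by simulation. The sketch below uses base R only and assumes an exponential parent distribution purely for illustration (any continuous distribution should behave the same way); the seed and the number of replicates are arbitrary choices.

```r
# Simulate the coverage C of the two-sided interval [x_(1), x_(n)] and
# compare it with a beta(v - u, w + u) distribution.
set.seed(1)
n <- 20; u <- 1; w <- 1
v <- n + 1 - w
n_sim <- 10000

coverage <- replicate(n_sim, {
  x <- sort(rexp(n))
  # True probability content of [x_(u), x_(v)] under the parent distribution
  pexp(x[v]) - pexp(x[u])
})

probs <- c(0.1, 0.5, 0.9)
sim_q  <- quantile(coverage, probs, names = FALSE)
beta_q <- qbeta(probs, shape1 = v - u, shape2 = w + u)
rbind(simulated = sim_q, theoretical = beta_q)
```

The simulated and theoretical quantiles should agree closely, up to Monte Carlo error.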
Computations for a \beta-Content Tolerance Interval
For a \beta-content tolerance interval, if the coverage C = \beta is specified, then the associated confidence level (1-\alpha)100\% is computed as:
1 - \alpha = 1 - F(\beta, v-u, w+u) \;\;\;\;\;\; (8)
where F(y, \delta, \gamma) denotes the cumulative distribution function of a beta random variable with parameters shape1=\delta and shape2=\gamma evaluated at y.
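As a worked instance of equation (8), the confidence level can be computed directly with the base R function pbeta. The values n = 20, u = 1, w = 1 and coverage 0.90 are taken from the first example in this help file; the variable names are only illustrative.

```r
# Confidence level of a two-sided beta-content interval [x_(1), x_(n)]
# with specified coverage beta = 0.90 and n = 20, via equation (8).
n <- 20; u <- 1; w <- 1
v <- n + 1 - w
beta <- 0.90

conf_level <- 1 - pbeta(beta, shape1 = v - u, shape2 = w + u)
conf_level   # about 0.608
```

This agrees with the confidence level of 60.8253% that tolIntNpar reports for that example.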
Similarly, if the confidence level associated with the tolerance interval is specified as (1-\alpha)100\%, then the coverage C = \beta is computed as:
\beta = B(\alpha, v-u, w+u) \;\;\;\;\;\; (9)
where B(p, \delta, \gamma) denotes the p'th quantile of a beta distribution with parameters shape1=\delta and shape2=\gamma.
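Equation (9) can be evaluated the same way with the base R function qbeta. The sketch below uses the settings of the copper example in this help file (n = 24, one-sided upper interval with the maximum as the upper limit, 95% confidence); the variable names are illustrative.

```r
# Coverage of a one-sided upper beta-content interval [lb, x_(n)]
# with specified confidence level 95% and n = 24, via equation (9).
n <- 24; u <- 0; w <- 1
v <- n + 1 - w
alpha <- 0.05

coverage <- qbeta(alpha, shape1 = v - u, shape2 = w + u)
coverage   # about 0.8827
```

With shape2 = 1 the beta quantile has the closed form alpha^(1/n), so the result can be cross-checked against 0.05^(1/24). It matches the coverage of 88.26538% reported by tolIntNpar for that example.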
Computations for a \beta-Expectation Tolerance Interval
For a \beta-expectation tolerance interval, the expected coverage is simply the mean of a beta random variable with parameters shape1=v-u and shape2=w+u, which is given by:
E(C) = \frac{v-u}{n+1} \;\;\;\;\;\; (10)
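Equation (10) is simple arithmetic, sketched here for the two interval shapes used in the examples of this help file. The helper name expected_coverage is hypothetical, not an EnvStats function.

```r
# Expected coverage of [x_(u), x_(n+1-w)] from equation (10).
expected_coverage <- function(n, u, w) {
  v <- n + 1 - w
  (v - u) / (n + 1)
}

expected_coverage(n = 24, u = 0, w = 1)   # one-sided upper, [lb, x_(24)]: 0.96
expected_coverage(n = 20, u = 1, w = 1)   # two-sided, [x_(1), x_(20)]: 19/21
```

The first value corresponds to the 96% coverage that tolIntNpar reports for the beta-expectation version of the copper example.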
As stated above, a \beta-expectation tolerance interval with coverage \beta 100\% is equivalent to a prediction interval for one future observation with associated confidence level \beta 100\%. This is because the probability that any single future observation will fall into this interval is \beta 100\%, so the number of N future observations that fall into this interval follows a binomial distribution with parameters size=N and prob=\beta. Hence the expected proportion of future observations that fall into this interval is \beta 100\% and is independent of the value of N.
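The prediction-interval interpretation can be illustrated with a small simulation in base R. This is only a sketch: the lognormal parent distribution, the seed, and the number of replicates are arbitrary assumptions, and the interval is the one-sided upper interval [0, x_(n)] with n = 24.

```r
# Fraction of single future observations that fall at or below the sample
# maximum, compared with the expected coverage (v - u)/(n + 1).
set.seed(2)
n <- 24; u <- 0; w <- 1
v <- n + 1 - w
n_sim <- 20000

inside <- replicate(n_sim, {
  x <- rlnorm(n)        # background sample
  future <- rlnorm(1)   # one future observation from the same distribution
  future <= max(x)      # is it inside [0, x_(n)]?
})

c(simulated = mean(inside), theoretical = (v - u) / (n + 1))
```

The simulated proportion should be close to 0.96, up to Monte Carlo error.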
See the help file for predIntNpar for more information on constructing a nonparametric prediction interval.
A list of class "estimate" containing the estimated parameters, the tolerance interval, and other information. See estimate.object for details.
Tolerance intervals have long been applied to quality control and life testing problems (Hahn, 1970b,c; Hahn and Meeker, 1991; Krishnamoorthy and Mathew, 2009). References that discuss tolerance intervals in the context of environmental monitoring include: Berthouex and Brown (2002, Chapter 21), Gibbons et al. (2009), Millard and Neerchal (2001, Chapter 6), Singh et al. (2010b), and USEPA (2009).
Steven P. Millard (EnvStats@ProbStatInfo.com)
Conover, W.J. (1980). Practical Nonparametric Statistics. Second Edition. John Wiley and Sons, New York.
Danziger, L., and S. Davis. (1964). Tables of Distribution-Free Tolerance Limits. Annals of Mathematical Statistics 35(5), 1361–1365.
Davis, C.B. (1994). Environmental Regulatory Statistics. In Patil, G.P., and C.R. Rao, eds., Handbook of Statistics, Vol. 12: Environmental Statistics. North-Holland, Amsterdam, a division of Elsevier, New York, NY, Chapter 26, 817–865.
Davis, C.B., and R.J. McNichols. (1994a). Ground Water Monitoring Statistics Update: Part I: Progress Since 1988. Ground Water Monitoring and Remediation 14(4), 148–158.
Gibbons, R.D. (1991b). Statistical Tolerance Limits for Ground-Water Monitoring. Ground Water 29, 563–570.
Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.
Guttman, I. (1970). Statistical Tolerance Regions: Classical and Bayesian. Hafner Publishing Co., Darien, CT, Chapter 2.
Hahn, G.J., and W.Q. Meeker. (1991). Statistical Intervals: A Guide for Practitioners. John Wiley and Sons, New York, 392pp.
Helsel, D.R., and R.M. Hirsch. (1992). Statistical Methods in Water Resources Research. Elsevier, New York, NY, pp.88-90.
Krishnamoorthy, K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley and Sons, Hoboken.
Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.
Wilks, S.S. (1941). Determination of Sample Sizes for Setting Tolerance Limits. Annals of Mathematical Statistics 12, 91–96.
eqnpar, estimate.object, tolIntNparN, Tolerance Intervals, Estimating Distribution Parameters, Estimating Distribution Quantiles.
# Generate 20 observations from a lognormal mixture distribution
# with parameters mean1=1, cv1=0.5, mean2=5, cv2=1, and p.mix=0.1.
# The exact two-sided interval that contains 90% of this distribution is given by:
# [0.682312, 13.32052]. Use tolIntNpar to construct a two-sided 90%
# beta-content tolerance interval. Note that the associated confidence level
# is only 61%. A larger sample size is required to obtain a larger confidence
# level (see the help file for tolIntNparN).
# (Note: the call to set.seed simply allows you to reproduce this example.)
set.seed(23)
dat <- rlnormMixAlt(20, 1, 0.5, 5, 1, 0.1)
tolIntNpar(dat, coverage = 0.9)
#Results of Distribution Parameter Estimation
#
#
#Assumed Distribution: None
#
#Data: dat
#
#Sample Size: 20
#
#Tolerance Interval Coverage: 90%
#
#Coverage Type: content
#
#Tolerance Interval Method: Exact
#
#Tolerance Interval Type: two-sided
#
#Confidence Level: 60.8253%
#
#Tolerance Limit Rank(s): 1 20
#
#Tolerance Interval: LTL = 0.5035035
# UTL = 9.9504662
#
# Clean up
rm(dat)
#
# Reproduce Example 17-4 on page 17-21 of USEPA (2009). This example uses
# copper concentrations (ppb) from 3 background wells to set an upper
# limit for 2 compliance wells. The maximum value from the 3 wells is set
# to the 95% confidence upper tolerance limit, and we need to determine the
# coverage of this tolerance interval. The data are stored in EPA.92c.copper2.df.
# Note that even though these data are Type I left singly censored, it is still
# possible to compute an upper tolerance interval using any of the uncensored
# observations as the upper limit.
EPA.92c.copper2.df
# Copper.orig Copper Censored Month Well Well.type
#1 <5 5.0 TRUE 1 1 Background
#2 <5 5.0 TRUE 2 1 Background
#3 7.5 7.5 FALSE 3 1 Background
#...
#9 9.2 9.2 FALSE 1 2 Background
#10 <5 5.0 TRUE 2 2 Background
#11 <5 5.0 TRUE 3 2 Background
#...
#17 <5 5.0 TRUE 1 3 Background
#18 5.4 5.4 FALSE 2 3 Background
#19 6.7 6.7 FALSE 3 3 Background
#...
#29 6.2 6.2 FALSE 5 4 Compliance
#30 <5 5.0 TRUE 6 4 Compliance
#31 7.8 7.8 FALSE 7 4 Compliance
#...
#38 <5 5.0 TRUE 6 5 Compliance
#39 5.6 5.6 FALSE 7 5 Compliance
#40 <5 5.0 TRUE 8 5 Compliance
with(EPA.92c.copper2.df,
tolIntNpar(Copper[Well.type=="Background"],
conf.level = 0.95, lb = 0, ti.type = "upper"))
#Results of Distribution Parameter Estimation
#
#
#Assumed Distribution: None
#
#Data: Copper[Well.type == "Background"]
#
#Sample Size: 24
#
#Tolerance Interval Coverage: 88.26538%
#
#Coverage Type: content
#
#Tolerance Interval Method: Exact
#
#Tolerance Interval Type: upper
#
#Confidence Level: 95%
#
#Tolerance Limit Rank(s): 24
#
#Tolerance Interval: LTL = 0.0
# UTL = 9.2
#
# Repeat the last example, except compute an upper
# beta-expectation tolerance interval:
with(EPA.92c.copper2.df,
tolIntNpar(Copper[Well.type=="Background"],
cov.type = "expectation", lb = 0, ti.type = "upper"))
#Results of Distribution Parameter Estimation
#
#
#Assumed Distribution: None
#
#Data: Copper[Well.type == "Background"]
#
#Sample Size: 24
#
#Tolerance Interval Coverage: 96%
#
#Coverage Type: expectation
#
#Tolerance Interval Method: Exact
#
#Tolerance Interval Type: upper
#
#Tolerance Limit Rank(s): 24
#
#Tolerance Interval: LTL = 0.0
# UTL = 9.2