tolIntPois {EnvStats} | R Documentation
Tolerance Interval for a Poisson Distribution
Description
Construct a $\beta$-content or $\beta$-expectation tolerance interval for a Poisson distribution.
Usage
tolIntPois(x, coverage = 0.95, cov.type = "content", ti.type = "two-sided",
conf.level = 0.95)
Arguments
x |
numeric vector of observations, or an object resulting from a call to an
estimating function that assumes a Poisson distribution
(e.g., epois).
coverage |
a scalar between 0 and 1 indicating the desired coverage of the tolerance interval.
The default value is coverage=0.95.
cov.type |
character string specifying the coverage type for the tolerance interval.
The possible values are "content" (the default) and "expectation".
ti.type |
character string indicating what kind of tolerance interval to compute.
The possible values are "two-sided" (the default), "lower", and "upper".
conf.level |
a scalar between 0 and 1 indicating the confidence level associated with the tolerance
interval. The default value is conf.level=0.95.
Details
If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.
A tolerance interval for some population is an interval on the real line constructed so as to contain $100\beta\%$ of the population (i.e., $100\beta\%$ of all future observations), where $0 < \beta < 1$. The quantity $100\beta\%$ is called the coverage.
There are two kinds of tolerance intervals (Guttman, 1970):
A $\beta$-content tolerance interval with confidence level $100(1-\alpha)\%$ is constructed so that it contains at least $100\beta\%$ of the population (i.e., the coverage is at least $100\beta\%$) with probability $100(1-\alpha)\%$, where $0 < \alpha < 1$. The quantity $100(1-\alpha)\%$ is called the confidence level or confidence coefficient associated with the tolerance interval.
A $\beta$-expectation tolerance interval is constructed so that the average coverage of the interval is $100\beta\%$.
Note: A $\beta$-expectation tolerance interval with coverage $100\beta\%$ is equivalent to a prediction interval for one future observation with associated confidence level $100\beta\%$. Note that there is no explicit confidence level associated with a $\beta$-expectation tolerance interval. If a $\beta$-expectation tolerance interval is treated as a $\beta$-content tolerance interval, the confidence level associated with this tolerance interval is usually around 50% (e.g., Guttman, 1970, Table 4.2, p.76).
Because of the discrete nature of the Poisson distribution, even true tolerance intervals (tolerance intervals based on the true value of $\lambda$) will usually not contain exactly $100\beta\%$ of the population. For example, for the Poisson distribution with parameter lambda=2, the interval [0, 4] contains 94.7% of this distribution and the interval [0, 5] contains 98.3% of this distribution. Thus, no interval of the form [0, k] can contain exactly 95% of this distribution.
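The two coverage figures quoted above can be checked with the base R Poisson distribution function. This is only a quick numerical illustration; the variable names are chosen here for the example.

```r
# Coverage of [0, 4] and [0, 5] under a Poisson distribution with lambda = 2
cov_0_to_4 <- ppois(4, lambda = 2)   # Pr(X <= 4), about 0.947
cov_0_to_5 <- ppois(5, lambda = 2)   # Pr(X <= 5), about 0.983
c(cov_0_to_4, cov_0_to_5)
```

Because the distribution function jumps from about 0.947 to about 0.983 between 4 and 5, a coverage of exactly 0.95 is not attainable by an interval starting at 0.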
$\beta$-Content Tolerance Intervals for a Poisson Distribution
Zacks (1970) showed that for monotone likelihood ratio (MLR) families of discrete distributions, a uniformly most accurate upper $\beta$-content tolerance interval with associated confidence level $100(1-\alpha)\%$ is constructed by finding the upper $100(1-\alpha)\%$ confidence limit for the parameter associated with the distribution, and then computing the $\beta$'th quantile of the distribution assuming the true value of the parameter is equal to the upper confidence limit. This idea can be extended to one-sided lower and two-sided tolerance limits.
It can be shown that all one-parameter exponential families have the MLR property. The Poisson distribution is a one-parameter exponential family, so the method of Zacks (1970) can be applied to a Poisson distribution.
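The defining property of the upper limit (coverage of at least the requested level, with at least the stated confidence) can be checked numerically. The sketch below is not taken from EnvStats: it assumes the upper confidence limit for the Poisson mean is the usual exact chi-square (Garwood) limit based on the total count, and it sums over all possible totals rather than simulating.

```r
# Exact confidence that a Zacks-type upper tolerance limit attains the
# requested coverage, for lambda = 2, n = 20, coverage 95%, confidence 90%.
# Assumption: the upper confidence limit for lambda is the exact
# chi-square (Garwood) limit computed from the total count s.
lambda <- 2; n <- 20; coverage <- 0.95; conf.level <- 0.90

s        <- 0:400                                      # possible total counts
ucl      <- qchisq(conf.level, 2 * (s + 1)) / (2 * n)  # upper limit for lambda
utl      <- qpois(coverage, ucl)                       # upper tolerance limit
true_cov <- ppois(utl, lambda)                         # coverage actually attained

# Probability, over all samples of size n, that the attained coverage
# is at least the requested coverage
achieved_conf <- sum(dpois(s, n * lambda) * (true_cov >= coverage))
achieved_conf >= conf.level
```

The final comparison should be TRUE, since an exact upper confidence limit falls at or above the true mean with probability at least the confidence level, and the quantile is nondecreasing in the mean.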
Let $X$ denote a Poisson random variable with parameter lambda=$\lambda$. Let $x_{p|\lambda}$ denote the $p$'th quantile of this distribution. That is,
$$\Pr(X < x_{p|\lambda}) \le p \le \Pr(X \le x_{p|\lambda})$$
Note that due to the discrete nature of the Poisson distribution, there will be several values of $p$ associated with one value of $X$. For example, for $\lambda = 2$, the value 1 is the $p$'th quantile for any value of $p$ between 0.135 and 0.406.
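The many-to-one relation between probabilities and quantiles can be seen with the base R function qpois, which returns the smallest value whose cumulative probability is at least the requested probability. The probabilities used below are arbitrary illustrative choices.

```r
# Cumulative probabilities at 0 and 1 for lambda = 2
cum_probs <- ppois(0:1, lambda = 2)                     # about 0.135 and 0.406

# Any p strictly between those two values maps to the quantile 1
q_inside  <- qpois(c(0.14, 0.25, 0.40), lambda = 2)     # 1 1 1

# Probabilities just outside that range map to 0 and 2
q_outside <- qpois(c(0.13, 0.41), lambda = 2)           # 0 2
```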
Let $\mathbf{x}$ denote a vector of $n$ observations from a Poisson distribution with parameter lambda=$\lambda$.
When ti.type="upper", the first step is to compute the one-sided upper $100(1-\alpha)\%$ confidence limit for $\lambda$ based on the observations $\mathbf{x}$ (see the help file for epois). Denote this upper confidence limit by $UCL$. The one-sided upper $100\beta\%$ tolerance limit is then given by:
$$[0, \; x_{\beta \mid \lambda = UCL}]$$
Similarly, when ti.type="lower", the first step is to compute the one-sided lower $100(1-\alpha)\%$ confidence limit for $\lambda$ based on the observations $\mathbf{x}$. Denote this lower confidence limit by $LCL$. The one-sided lower $100\beta\%$ tolerance limit is then given by:
$$[x_{1-\beta \mid \lambda = LCL}, \; \infty)$$
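Finally, when ti.type="two-sided", the first step is to compute the two-sided $100(1-\alpha)\%$ confidence limits for $\lambda$ based on the observations $\mathbf{x}$. Denote these confidence limits by $LCL$ and $UCL$. The two-sided $100\beta\%$ tolerance limit is then given by:
$$[x_{(1-\beta)/2 \mid \lambda = LCL}, \; x_{(1+\beta)/2 \mid \lambda = UCL}]$$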
Note that the function tolIntPois uses the exact confidence limits for $\lambda$ when computing $\beta$-content tolerance limits (see epois).
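The three constructions described above can be sketched in base R. This is an illustration under stated assumptions, not the EnvStats source: zacks_tol_pois is a hypothetical helper, the exact confidence limits are assumed to be the chi-square (Garwood) limits based on the total count, and quantiles are taken with qpois.

```r
# Hypothetical helper (not part of EnvStats): Zacks-type beta-content
# tolerance limits for Poisson data, given the total count s of n observations.
# Assumption: exact confidence limits for lambda are the chi-square limits.
zacks_tol_pois <- function(s, n, coverage = 0.95,
                           ti.type = c("two-sided", "lower", "upper"),
                           conf.level = 0.95) {
  ti.type <- match.arg(ti.type)
  alpha <- 1 - conf.level
  # Exact lower and upper confidence limits for lambda with tail area a
  lcl <- function(a) if (s == 0) 0 else qchisq(a, 2 * s) / (2 * n)
  ucl <- function(a) qchisq(1 - a, 2 * (s + 1)) / (2 * n)
  switch(ti.type,
    "upper"     = c(LTL = 0, UTL = qpois(coverage, ucl(alpha))),
    "lower"     = c(LTL = qpois(1 - coverage, lcl(alpha)), UTL = Inf),
    "two-sided" = c(LTL = qpois((1 - coverage) / 2, lcl(alpha / 2)),
                    UTL = qpois((1 + coverage) / 2, ucl(alpha / 2))))
}

# n = 20 observations with sample mean 1.8, i.e. a total count of 36
two_sided <- zacks_tol_pois(s = 36, n = 20, conf.level = 0.9)
two_sided   # LTL = 0, UTL = 6
```

With a total count of 36 from 20 observations, 95% coverage, and 90% confidence, the sketch gives the same two-sided limits (0 and 6) as the tolIntPois output shown in the Examples section.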
$\beta$-Expectation Tolerance Intervals for a Poisson Distribution
As stated above, a $\beta$-expectation tolerance interval with coverage $100\beta\%$ is equivalent to a prediction interval for one future observation with associated confidence level $100\beta\%$. This is because the probability that any single future observation will fall into this interval is $\beta$, so the distribution of the number of $N$ future observations that will fall into this interval is binomial with parameters size=$N$ and prob=$\beta$. Hence the expected proportion of future observations that fall into this interval is $100\beta\%$ and is independent of the value of $N$. See the help file for predIntPois for information on how these intervals are constructed.
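The binomial argument above can be verified numerically: for a count that is binomial with a given size and success probability, the expected proportion equals that probability whatever the size. The values of the size and the probability below are illustrative choices only.

```r
# Expected proportion of N future observations inside an interval that each
# observation enters with probability cover_prob: Y ~ Binomial(N, cover_prob)
cover_prob <- 0.95
expected_prop <- sapply(c(1, 10, 100), function(N) {
  y <- 0:N
  sum(y * dbinom(y, size = N, prob = cover_prob)) / N   # E[Y] / N
})
expected_prop   # each entry equals cover_prob up to rounding error
```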
Value
If x is a numeric vector, tolIntPois returns a list of class "estimate" containing the estimated parameters, a component called interval containing the tolerance interval information, and other information. See estimate.object for details.
If x is the result of calling an estimation function, tolIntPois returns a list whose class is the same as x. The list contains the same components as x. If x already has a component called interval, this component is replaced with the tolerance interval information.
Note
Tolerance intervals have long been applied to quality control and life testing problems (Hahn, 1970b,c; Hahn and Meeker, 1991; Krishnamoorthy and Mathew, 2009). References that discuss tolerance intervals in the context of environmental monitoring include: Berthouex and Brown (2002, Chapter 21), Gibbons et al. (2009), Millard and Neerchal (2001, Chapter 6), Singh et al. (2010b), and USEPA (2009).
Gibbons (1987b) used the Poisson distribution to model the number of detected compounds per scan of the 32 volatile organic priority pollutants (VOC), and also to model the distribution of chemical concentration (in ppb). He explained the derivation of a one-sided upper $\beta$-content tolerance limit for a Poisson distribution based on the work of Zacks (1970) using the Pearson-Hartley approximation to the confidence limits for the mean parameter $\lambda$ (see the help file for epois). Note that there are several typographical errors in the derivation and examples on page 575 of Gibbons (1987b) because there is confusion between where the value of $\beta$ (the coverage) should be and where the value of $1-\alpha$ (the confidence level) should be. Gibbons et al. (2009, pp.103-104) give correct formulas.
Author(s)
Steven P. Millard (EnvStats@ProbStatInfo.com)
References
Gibbons, R.D. (1987b). Statistical Models for the Analysis of Volatile Organic Compounds in Waste Disposal Sites. Ground Water 25, 572–580.
Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.
Guttman, I. (1970). Statistical Tolerance Regions: Classical and Bayesian. Hafner Publishing Co., Darien, CT.
Hahn, G.J., and W.Q. Meeker. (1991). Statistical Intervals: A Guide for Practitioners. John Wiley and Sons, New York.
Johnson, N. L., S. Kotz, and A. Kemp. (1992). Univariate Discrete Distributions. Second Edition. John Wiley and Sons, New York, Chapter 4.
Krishnamoorthy, K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley and Sons, Hoboken.
Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton.
Zacks, S. (1970). Uniformly Most Accurate Upper Tolerance Limits for Monotone Likelihood Ratio Families of Discrete Distributions. Journal of the American Statistical Association 65, 307–316.
See Also
Poisson, epois, eqpois, estimate.object, Tolerance Intervals, Estimating Distribution Parameters, Estimating Distribution Quantiles.
Examples
# Generate 20 observations from a Poisson distribution with parameter
# lambda=2. The interval [0, 4] contains 94.7% of this distribution and
# the interval [0,5] contains 98.3% of this distribution. Thus, because
# of the discrete nature of the Poisson distribution, no interval contains
# exactly 95% of this distribution. Use tolIntPois to estimate the mean
# parameter of the true distribution, and construct a two-sided 95%
# beta-content tolerance interval with associated confidence level 90%.
# (Note: the call to set.seed simply allows you to reproduce this example.)
set.seed(250)
dat <- rpois(20, 2)
tolIntPois(dat, conf.level = 0.9)
#Results of Distribution Parameter Estimation
#--------------------------------------------
#
#Assumed Distribution: Poisson
#
#Estimated Parameter(s): lambda = 1.8
#
#Estimation Method: mle/mme/mvue
#
#Data: dat
#
#Sample Size: 20
#
#Tolerance Interval Coverage: 95%
#
#Coverage Type: content
#
#Tolerance Interval Method: Zacks
#
#Tolerance Interval Type: two-sided
#
#Confidence Level: 90%
#
#Tolerance Interval: LTL = 0
# UTL = 6
#------
# Clean up
rm(dat)