epois {EnvStats} | R Documentation |
Estimate Parameter of a Poisson Distribution
Description
Estimate the mean of a Poisson distribution, and optionally construct a confidence interval for the mean.
Usage
epois(x, method = "mle/mme/mvue", ci = FALSE, ci.type = "two-sided",
ci.method = "exact", conf.level = 0.95)
Arguments
x |
numeric vector of observations. |
method |
character string specifying the method of estimation. Currently the only possible
value is "mle/mme/mvue" (maximum likelihood/method of moments/minimum variance unbiased). |
ci |
logical scalar indicating whether to compute a confidence interval for the
mean, lambda. The default value is ci=FALSE. |
ci.type |
character string indicating what kind of confidence interval to compute. The
possible values are "two-sided" (the default), "lower", and "upper". |
ci.method |
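character string indicating what method to use to construct the confidence interval
for the mean, lambda. Possible values are "exact" (the default), "pearson.hartley.approx", and "normal.approx". See the Details section for more information. |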
conf.level |
a scalar between 0 and 1 indicating the confidence level of the confidence interval.
The default value is conf.level=0.95. |
Details
If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.
Let $\underline{x} = (x_1, x_2, \ldots, x_n)$ be a vector of $n$ observations from a Poisson distribution with parameter lambda=$\lambda$. It can be shown (e.g., Forbes et al., 2011) that if $y$ is defined as:
$$y = \sum_{i=1}^{n} x_i \qquad (1)$$
then $y$ is an observation from a Poisson distribution with parameter lambda=$n\lambda$.
Estimation
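The maximum likelihood, method of moments, and minimum variance unbiased estimator
(mle/mme/mvue) of $\lambda$ is given by:
$$\hat{\lambda} = \bar{x} \qquad (2)$$
where
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{y}{n} \qquad (3)$$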
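As a minimal sketch of the estimation step in base R (not the EnvStats implementation itself), the estimate is simply the sample mean after non-finite values are dropped. The data vector below is made up for illustration.

```r
# Hypothetical data containing a missing and an infinite value
x <- c(2, 0, 3, NA, 1, Inf, 4)

# Drop NA, NaN, Inf and -Inf, mirroring the removal described in Details
x <- x[is.finite(x)]

n <- length(x)        # number of usable observations
y <- sum(x)           # total count; Poisson with mean n * lambda
lambda.hat <- y / n   # mle/mme/mvue of lambda, identical to mean(x)

print(c(n = n, y = y, lambda.hat = lambda.hat))
```

Here the five usable observations sum to 10, so the estimate of lambda is 2.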
Confidence Intervals
There are three possible ways to construct a confidence interval for $\lambda$: based on the exact distribution of the estimator of $\lambda$ (ci.method="exact"), based on an approximation of Pearson and Hartley (ci.method="pearson.hartley.approx"), or based on the normal approximation (ci.method="normal.approx").
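Exact Confidence Interval (ci.method="exact")
If ci.type="two-sided", an exact $(1-\alpha)100\%$ confidence interval for $\lambda$ can be constructed as $[LCL, UCL]$, where the confidence limits are computed such that:
$$\Pr[W \ge y \mid \lambda = LCL] = \frac{\alpha}{2} \qquad (4)$$
$$\Pr[W \le y \mid \lambda = UCL] = \frac{\alpha}{2} \qquad (5)$$
where $y$ is defined in equation (1) and $W$ denotes a Poisson random variable with parameter lambda=$n\lambda$.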
If ci.type="lower", $\alpha/2$ is replaced with $\alpha$ in equation (4) and $UCL$ is set to $\infty$.
If ci.type="upper", $\alpha/2$ is replaced with $\alpha$ in equation (5) and $LCL$ is set to 0.
Note that an exact upper confidence bound can be computed even when all
observations are 0.
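The defining conditions in equations (4) and (5) can be solved numerically with base R. The sketch below is one way to do that and is not taken from the EnvStats source. The function name `exact_pois_ci` and the search bracket `upper.search` are hypothetical, and the code assumes a confidence level of at least 0.5 so that each root is bracketed.

```r
# Sketch: exact Poisson confidence limits from the total count y and sample size n
exact_pois_ci <- function(y, n, conf.level = 0.95,
                          ci.type = c("two-sided", "lower", "upper")) {
  ci.type <- match.arg(ci.type)
  alpha <- 1 - conf.level
  # Tail probability used for each limit that is actually computed
  a <- if (ci.type == "two-sided") alpha / 2 else alpha
  # Generous upper end of the search bracket (an assumption of this sketch)
  upper.search <- (y + 10 * sqrt(y + 1) + 10) / n

  lcl <- 0
  if (ci.type != "upper" && y > 0) {
    # Pr[W >= y | lambda = LCL] = a, where W ~ Poisson(n * lambda)
    lcl <- uniroot(function(l) ppois(y - 1, n * l, lower.tail = FALSE) - a,
                   interval = c(0, y / n), tol = 1e-12)$root
  }
  ucl <- Inf
  if (ci.type != "lower") {
    # Pr[W <= y | lambda = UCL] = a
    ucl <- uniroot(function(l) ppois(y, n * l) - a,
                   interval = c(y / n, upper.search), tol = 1e-12)$root
  }
  c(LCL = lcl, UCL = ucl)
}

# Summary statistics from the Examples section: n = 20, estimate 1.8, so y = 36
ci <- exact_pois_ci(y = 36, n = 20, conf.level = 0.9)
print(ci)   # approximately LCL = 1.336558, UCL = 2.377037

# An upper bound still exists when every observation is 0
zero.ci <- exact_pois_ci(y = 0, n = 20, conf.level = 0.95, ci.type = "upper")
print(zero.ci)
```

With all observations equal to 0, equation (5) reduces to $e^{-n \cdot UCL} = \alpha$, so the upper bound is $-\log(\alpha)/n$.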
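Pearson-Hartley Approximation (ci.method="pearson.hartley.approx")
For a two-sided $(1-\alpha)100\%$ confidence interval for $\lambda$, the Pearson and Hartley approximation (Zar, 2010, p.587; Pearson and Hartley, 1970, p.81) is given by:
$$\left[ \frac{\chi^2_{(2n\bar{x},\, \alpha/2)}}{2n}, \; \frac{\chi^2_{(2n\bar{x}+2,\, 1-\alpha/2)}}{2n} \right] \qquad (6)$$
where $\chi^2_{(\nu, p)}$ denotes the $p$'th quantile of the chi-square distribution with $\nu$ degrees of freedom.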
One-sided confidence intervals are computed in a similar fashion.
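A minimal base R sketch of the two-sided chi-square interval follows. It assumes the approximation takes the usual form, with 2y and 2y + 2 degrees of freedom and division by 2n. It uses the summary statistics from the Examples section (n = 20, estimate 1.8) and is not the EnvStats source code.

```r
n <- 20            # sample size
y <- 36            # total count, n * mean = 20 * 1.8
conf.level <- 0.9
alpha <- 1 - conf.level

# Chi-square quantiles with 2y and 2y + 2 degrees of freedom, scaled by 2n
ph.ci <- c(LCL = qchisq(alpha / 2, df = 2 * y) / (2 * n),
           UCL = qchisq(1 - alpha / 2, df = 2 * y + 2) / (2 * n))
print(ph.ci)       # approximately LCL = 1.336558, UCL = 2.377037
```

These limits agree with the values the Examples section reports for both ci.method="pearson" and the default exact method.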
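Normal Approximation (ci.method="normal.approx")
An approximate $(1-\alpha)100\%$ confidence interval for $\lambda$ can be constructed assuming the estimator of $\lambda$ is approximately normally distributed. A two-sided confidence interval is constructed as:
$$\left[ \hat{\lambda} - z_{1-\alpha/2}\, \hat{\sigma}_{\hat{\lambda}}, \; \hat{\lambda} + z_{1-\alpha/2}\, \hat{\sigma}_{\hat{\lambda}} \right] \qquad (7)$$
where $z_p$ is the $p$'th quantile of the standard normal distribution, and the quantity
$$\hat{\sigma}_{\hat{\lambda}} = \sqrt{\frac{\hat{\lambda}}{n}} \qquad (8)$$
denotes the estimated asymptotic standard deviation of the estimator of $\lambda$.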
One-sided confidence intervals are constructed in a similar manner.
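The normal-approximation interval can be sketched in a few lines of base R. This example assumes the standard error is the square root of the estimate divided by n, which follows from the Poisson variance equalling its mean. It reuses the summary statistics from the Examples section.

```r
n <- 20
lambda.hat <- 1.8
conf.level <- 0.9
alpha <- 1 - conf.level

# Estimated asymptotic standard deviation of the estimator
se <- sqrt(lambda.hat / n)

# Symmetric interval around the estimate
z <- qnorm(1 - alpha / 2)
norm.ci <- c(LCL = lambda.hat - z * se, UCL = lambda.hat + z * se)
print(norm.ci)     # approximately LCL = 1.306544, UCL = 2.293456
```

Unlike the exact interval, this interval is symmetric about the estimate, and its lower limit can fall below 0 when the estimate is small relative to its standard error.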
Value
a list of class "estimate" containing the estimated parameters and other information. See estimate.object for details.
Note
The Poisson distribution is named after Poisson, who derived this distribution as the limiting distribution of the binomial distribution with parameters size=$N$ and prob=$p$, where $N$ tends to infinity, $p$ tends to 0, and $Np$ stays constant.
In this context, the Poisson distribution was used by Bortkiewicz (1898) to model
the number of deaths (per annum) from kicks by horses in Prussian Army Corps. In
this case, $p$, the probability of death from this cause, was small, but the
number of soldiers exposed to this risk, $N$, was large.
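The limiting relationship can be checked numerically. The sketch below holds the product of size and prob fixed at an arbitrarily chosen value of 2 and compares binomial and Poisson probabilities as size grows. The variable names are illustrative only.

```r
lambda <- 2                      # fixed value of size * prob (arbitrary choice)
k <- 0:10                        # counts at which to compare the two pmfs
N.values <- c(10, 100, 1e4, 1e6)

# Largest absolute difference between the binomial and Poisson pmfs for each N
max.diff <- sapply(N.values, function(N) {
  max(abs(dbinom(k, size = N, prob = lambda / N) - dpois(k, lambda)))
})
print(data.frame(N = N.values, max.diff = max.diff))
```

The maximum difference shrinks as size increases, which is the behaviour the limit describes.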
The Poisson distribution has been applied in a variety of fields, including quality control (modeling number of defects produced in a process), ecology (number of organisms per unit area), and queueing theory. Gibbons (1987b) used the Poisson distribution to model the number of detected compounds per scan of the 32 volatile organic priority pollutants (VOC), and also to model the distribution of chemical concentration (in ppb).
Author(s)
Steven P. Millard (EnvStats@ProbStatInfo.com)
References
Forbes, C., M. Evans, N. Hastings, and B. Peacock. (2011). Statistical Distributions. Fourth Edition. John Wiley and Sons, Hoboken, NJ.
Gibbons, R.D. (1987b). Statistical Models for the Analysis of Volatile Organic Compounds in Waste Disposal Sites. Ground Water 25, 572-580.
Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.
Johnson, N. L., S. Kotz, and A. Kemp. (1992). Univariate Discrete Distributions. Second Edition. John Wiley and Sons, New York, Chapter 4.
Pearson, E.S., and H.O. Hartley, eds. (1970). Biometrika Tables for Statisticians, Volume 1. Cambridge University Press, New York, p.81.
Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ, pp. 585–586.
See Also
Examples
# Generate 20 observations from a Poisson distribution with parameter
# lambda=2, then estimate the parameter and construct a 90% confidence
# interval.
# (Note: the call to set.seed simply allows you to reproduce this example.)
set.seed(250)
dat <- rpois(20, lambda = 2)
epois(dat, ci = TRUE, conf.level = 0.9)
#Results of Distribution Parameter Estimation
#--------------------------------------------
#
#Assumed Distribution: Poisson
#
#Estimated Parameter(s): lambda = 1.8
#
#Estimation Method: mle/mme/mvue
#
#Data: dat
#
#Sample Size: 20
#
#Confidence Interval for: lambda
#
#Confidence Interval Method: exact
#
#Confidence Interval Type: two-sided
#
#Confidence Level: 90%
#
#Confidence Interval: LCL = 1.336558
# UCL = 2.377037
#----------
# Compare the different ways of constructing confidence intervals for
# lambda using the same data as in the previous example:
epois(dat, ci = TRUE, ci.method = "pearson",
conf.level = 0.9)$interval$limits
# LCL UCL
#1.336558 2.377037
epois(dat, ci = TRUE, ci.method = "normal.approx",
conf.level = 0.9)$interval$limits
# LCL UCL
#1.306544 2.293456
#----------
# Clean up
#---------
rm(dat)