R: Estimate Parameter of a Poisson Distribution

epois {EnvStats}

R Documentation

Estimate Parameter of a Poisson Distribution

Description

Estimate the mean of a Poisson distribution, and optionally construct a confidence interval for the mean.

Usage

  epois(x, method = "mle/mme/mvue", ci = FALSE, ci.type = "two-sided", 
    ci.method = "exact", conf.level = 0.95)

Arguments

`x`	numeric vector of observations.
`method`	character string specifying the method of estimation. Currently the only possible value is `"mle/mme/mvue"` (maximum likelihood/method of moments/minimum variance unbiased; the default). See the DETAILS section for more information.
`ci`	logical scalar indicating whether to compute a confidence interval for the mean. The default value is `FALSE`.
`ci.type`	character string indicating what kind of confidence interval to compute. The possible values are `"two-sided"` (the default), `"lower"`, and `"upper"`. This argument is ignored if `ci=FALSE`.
`ci.method`	character string indicating what method to use to construct the confidence interval for the mean. Possible values are `"exact"` (the default), `"pearson.hartley.approx"` (Pearson-Hartley approximation), and `"normal.approx"` (normal approximation). See the DETAILS section for more information. This argument is ignored if `ci=FALSE`.
`conf.level`	a scalar between 0 and 1 indicating the confidence level of the confidence interval. The default value is `conf.level=0.95`. This argument is ignored if `ci=FALSE`.

Details

If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.

Let \underline{x} = (x_1, x_2, \ldots, x_n) be a vector of n observations from a Poisson distribution with parameter lambda=\lambda. It can be shown (e.g., Forbes et al., 2011) that if y is defined as:

y = \sum_{i=1}^n x_i \;\;\;\; (1)

then y is an observation from a Poisson distribution with parameter lambda=n \lambda.

Estimation
The maximum likelihood, method of moments, and minimum variance unbiased estimator (mle/mme/mvue) of \lambda is given by:

\hat{\lambda} = \bar{x} \;\;\;\; (2)

where

\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i = \frac{y}{n} \;\;\;\; (3)
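
As a minimal sketch of equations (2) and (3) in base R (this is not the EnvStats source, and the helper name `lambda_hat` is hypothetical), the estimate is the sample mean taken after non-finite values are dropped:

```r
# Hypothetical helper, for illustration only: drop NA, NaN, Inf and -Inf,
# then return the sample mean, which is the mle/mme/mvue of lambda.
lambda_hat <- function(x) {
  x <- x[is.finite(x)]  # is.finite() is FALSE for NA, NaN, Inf and -Inf
  mean(x)
}

lambda_hat(c(1, 2, NA, Inf, 3))  # mean of the finite values 1, 2, 3
```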

Confidence Intervals
There are three possible ways to construct a confidence interval for \lambda: based on the exact distribution of the estimator of \lambda (ci.method="exact"), based on an approximation of Pearson and Hartley (ci.method="pearson.hartley.approx"), or based on the normal approximation (ci.method="normal.approx").

Exact Confidence Interval (ci.method="exact")
If ci.type="two-sided", an exact (1-\alpha)100\% confidence interval for \lambda can be constructed as [LCL, UCL], where the confidence limits are computed such that:

Pr[Y \ge y \mid \lambda = LCL] = \frac{\alpha}{2} \;\;\;\; (4)

Pr[Y \le y \mid \lambda = UCL] = \frac{\alpha}{2} \;\;\;\; (5)

where y is defined in equation (1) and Y denotes a Poisson random variable with parameter lambda=n \lambda.

If ci.type="lower", \alpha/2 is replaced with \alpha in equation (4) and UCL is set to \infty.

If ci.type="upper", \alpha/2 is replaced with \alpha in equation (5) and LCL is set to 0.

Note that an exact upper confidence bound can be computed even when all observations are 0.
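
One way to illustrate equations (4) and (5) is to solve them numerically with base R's `ppois` and `uniroot`. The sketch below is an assumption-laden illustration, not the EnvStats implementation: the helper name `exact_pois_ci`, its arguments `y` (the sum in equation (1)) and `n`, and the search bracket are all hypothetical choices.

```r
# Hypothetical helper (not the EnvStats implementation): exact two-sided
# confidence limits for lambda, found by solving equations (4) and (5)
# numerically. y is the sum of the observations and n the sample size.
exact_pois_ci <- function(y, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  # Upper end of the root-search bracket (an assumed, generous bound)
  hi <- (y + 10 * sqrt(y + 1) + 10) / n
  # Equation (4): Pr[Y >= y] = alpha / 2 when Y has mean n * LCL.
  # When y is 0 the probability is always 1, so LCL is taken to be 0.
  lcl <- if (y == 0) 0 else uniroot(
    function(l) ppois(y - 1, n * l, lower.tail = FALSE) - alpha / 2,
    interval = c(0, hi), tol = 1e-10)$root
  # Equation (5): Pr[Y <= y] = alpha / 2 when Y has mean n * UCL.
  ucl <- uniroot(
    function(l) ppois(y, n * l) - alpha / 2,
    interval = c(0, hi), tol = 1e-10)$root
  c(LCL = lcl, UCL = ucl)
}

# y = 36 and n = 20 correspond to an estimate of 1.8 from 20 observations
exact_pois_ci(y = 36, n = 20, conf.level = 0.9)

# All observations equal to 0: the upper limit is still defined
exact_pois_ci(y = 0, n = 20, conf.level = 0.9)
```

With y = 36 and n = 20 the result should agree, up to numerical tolerance, with the limits reported in the Examples section. When y = 0, equation (5) reduces to exp(-n UCL) = \alpha/2, so the two-sided upper limit is -log(\alpha/2)/n.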

Pearson-Hartley Approximation (ci.method="pearson.hartley.approx")
For a two-sided (1-\alpha)100\% confidence interval for \lambda, the Pearson and Hartley approximation (Zar, 2010, p.587; Pearson and Hartley, 1970, p.81) is given by:

[\frac{\chi^2_{2n\bar{x}, \alpha/2}}{2n}, \frac{\chi^2_{2n\bar{x} + 2, 1 - \alpha/2}}{2n}] \;\;\;\; (6)

where \chi^2_{\nu, p} denotes the p'th quantile of the chi-square distribution with \nu degrees of freedom. One-sided confidence intervals are computed in a similar fashion.
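
Equation (6) can be evaluated directly with base R's `qchisq`. The following is a sketch only; the helper name `ph_pois_ci` and its arguments `y` (the sum of the observations, so that 2n\bar{x} = 2y) and `n` are hypothetical and not part of EnvStats.

```r
# Hypothetical helper: two-sided limits from equation (6).
# y is the sum of the observations, so the degrees of freedom
# 2 * n * mean(x) equal 2 * y.
ph_pois_ci <- function(y, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  c(LCL = qchisq(alpha / 2, df = 2 * y) / (2 * n),
    UCL = qchisq(1 - alpha / 2, df = 2 * y + 2) / (2 * n))
}

# y = 36 and n = 20 correspond to an estimate of 1.8 from 20 observations
ph_pois_ci(y = 36, n = 20, conf.level = 0.9)
```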

Normal Approximation (ci.method="normal.approx")
An approximate (1-\alpha)100\% confidence interval for \lambda can be constructed assuming the distribution of the estimator of \lambda is approximately normally distributed. A two-sided confidence interval is constructed as:

[\hat{\lambda} - z_{1-\alpha/2} \hat{\sigma}_{\hat{\lambda}}, \hat{\lambda} + z_{1-\alpha/2} \hat{\sigma}_{\hat{\lambda}}] \;\;\;\; (7)

where z_p is the p'th quantile of the standard normal distribution, and the quantity

\hat{\sigma}_{\hat{\lambda}} = \sqrt{\hat{\lambda} / n} \;\;\;\; (8)

denotes the estimated asymptotic standard deviation of the estimator of \lambda.

One-sided confidence intervals are constructed in a similar manner.
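
Equations (7) and (8) translate into a few lines of base R. This is a sketch under the same assumptions as the formulas above; the helper name `normal_pois_ci` and its arguments `y` (the sum of the observations) and `n` are hypothetical and not part of EnvStats.

```r
# Hypothetical helper: two-sided normal-approximation limits,
# following equations (7) and (8).
normal_pois_ci <- function(y, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  lambda.hat <- y / n              # equation (2)
  se <- sqrt(lambda.hat / n)       # equation (8)
  z <- qnorm(1 - alpha / 2)        # standard normal quantile
  c(LCL = lambda.hat - z * se, UCL = lambda.hat + z * se)
}

# y = 36 and n = 20 correspond to an estimate of 1.8 from 20 observations
normal_pois_ci(y = 36, n = 20, conf.level = 0.9)
```

Unlike the exact method, nothing in equation (7) prevents the lower limit from falling below 0 when \hat{\lambda} is small relative to its standard error.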

Value

a list of class "estimate" containing the estimated parameters and other information.
See estimate.object for details.

Note

The Poisson distribution is named after Poisson, who derived this distribution as the limiting distribution of the binomial distribution with parameters size=N and prob=p, where N tends to infinity, p tends to 0, and Np stays constant.

In this context, the Poisson distribution was used by Bortkiewicz (1898) to model the number of deaths (per annum) from kicks by horses in Prussian Army Corps. In this case, p, the probability of death from this cause, was small, but the number of soldiers exposed to this risk, N, was large.

The Poisson distribution has been applied in a variety of fields, including quality control (modeling number of defects produced in a process), ecology (number of organisms per unit area), and queueing theory. Gibbons (1987b) used the Poisson distribution to model the number of detected compounds per scan of the 32 volatile organic priority pollutants (VOC), and also to model the distribution of chemical concentration (in ppb).

Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

References

Forbes, C., M. Evans, N. Hastings, and B. Peacock. (2011). Statistical Distributions. Fourth Edition. John Wiley and Sons, Hoboken, NJ.

Gibbons, R.D. (1987b). Statistical Models for the Analysis of Volatile Organic Compounds in Waste Disposal Sites. Ground Water 25, 572-580.

Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.

Johnson, N. L., S. Kotz, and A. Kemp. (1992). Univariate Discrete Distributions. Second Edition. John Wiley and Sons, New York, Chapter 4.

Pearson, E.S., and H.O. Hartley, eds. (1970). Biometrika Tables for Statisticians, Volume 1. Cambridge University Press, New York, p.81.

Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ, pp. 585–586.

Examples

  # Generate 20 observations from a Poisson distribution with parameter 
  # lambda=2, then estimate the parameter and construct a 90% confidence 
  # interval. 
  # (Note: the call to set.seed simply allows you to reproduce this example.)

  set.seed(250) 
  dat <- rpois(20, lambda = 2) 
  epois(dat, ci = TRUE, conf.level = 0.9) 

  #Results of Distribution Parameter Estimation
  #--------------------------------------------
  #
  #Assumed Distribution:            Poisson
  #
  #Estimated Parameter(s):          lambda = 1.8
  #
  #Estimation Method:               mle/mme/mvue
  #
  #Data:                            dat
  #
  #Sample Size:                     20
  #
  #Confidence Interval for:         lambda
  #
  #Confidence Interval Method:      exact
  #
  #Confidence Interval Type:        two-sided
  #
  #Confidence Level:                90%
  #
  #Confidence Interval:             LCL = 1.336558
  #                                 UCL = 2.377037

  #----------

  # Compare the different ways of constructing confidence intervals for 
  # lambda using the same data as in the previous example:

  epois(dat, ci = TRUE, ci.method = "pearson.hartley.approx", 
    conf.level = 0.9)$interval$limits 
  #     LCL      UCL 
  #1.336558 2.377037

  epois(dat, ci = TRUE, ci.method = "normal.approx",  
    conf.level = 0.9)$interval$limits 
  #     LCL      UCL 
  #1.306544 2.293456 

  #----------

  # Clean up
  #---------

  rm(dat)

[Package EnvStats version 2.8.1 Index]