R: Checks the fit of a Poisson Distribution

check_pois {ptools}

R Documentation

Checks the fit of a Poisson Distribution

Description

Provides a frequency table to check the fit of a Poisson distribution to empirical data.

Usage

check_pois(counts, min_val, max_val, pred, silent = FALSE)

Arguments

`counts`	vector of counts, e.g. `c(0,5,1,3,4,6)`
`min_val`	scalar minimum value used to generate the grid of results, e.g. `0`
`max_val`	scalar maximum value used to generate the grid of results, e.g. `max(counts)`
`pred`	either a scalar, e.g. `mean(counts)`, or a vector (e.g. predicted values from a Poisson regression)
`silent`	boolean; if `TRUE`, the mean/variance message is not printed. Only applies when `pred` is a scalar (default `FALSE`)

Details

Given either a scalar mean or a set of predictions (e.g. varying means predicted from a model), checks whether the data fit the corresponding Poisson distribution over a specified set of integers. That is, it builds a table of integer counts and calculates the observed versus expected distribution according to the Poisson. Useful for spotting obvious deviations, such as excess zeroes or a longer tail than expected.
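
The comparison described above can be sketched in base R for the scalar case. This is a minimal illustration under stated assumptions, not the package's source: `obs_vs_pois` is a hypothetical helper, and it assumes the expected counts are `dpois()` probabilities scaled by the number of tabulated observations.

```r
# Hypothetical helper (not part of ptools): observed vs expected counts
# for a single Poisson mean over the integers min_val..max_val.
obs_vs_pois <- function(counts, min_val, max_val, mu) {
  ints <- min_val:max_val
  # Tabulate on a fixed grid so integers with zero observations still appear
  freq <- as.vector(table(factor(counts, levels = ints)))
  # Assumed scaling: Poisson probability times the number of tabulated counts
  expected <- stats::dpois(ints, mu) * sum(freq)
  data.frame(Int = ints, Freq = freq, PoisF = expected,
             ResidF = freq - expected)
}

y <- c(0, 5, 1, 3, 4, 6)
obs_vs_pois(y, 0, max(y), mean(y))
```

Tabulating on a factor with fixed levels keeps empty cells in the table, which matters when looking for gaps or thin tails.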

Value

A data frame with columns:

Int, the integer value
Freq, the observed count for that integer value
PoisF, the expected counts according to a Poisson distribution with mean/pred specified
ResidF, the residual from Freq - PoisF
Prop, the observed proportion of that integer (0-100 scale)
PoisD, the expected proportion of that integer (0-100 scale)
ResidD, the residual from Prop - PoisD
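
When `pred` is a vector, one natural way to obtain the expected proportion for each integer is to average the per-observation Poisson probabilities. The sketch below assumes that approach; `expected_prop` is a hypothetical name, and the package may compute this differently.

```r
# Hypothetical helper (not part of ptools): expected percentage of each
# integer when every observation has its own predicted mean.
expected_prop <- function(ints, pred) {
  # For each integer k, average P(X = k | mean = pred[i]) across observations
  probs <- vapply(ints, function(k) mean(stats::dpois(k, pred)), numeric(1))
  probs * 100  # 0-100 scale, matching the Prop and PoisD columns
}

set.seed(10)
ru <- runif(1000, 0, 10)
round(expected_prop(0:5, ru), 2)
```

With a scalar `pred` the average collapses to a single `dpois()` call, so the same helper covers both cases.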

Examples

# Example use for constant over the whole sample
set.seed(10)
lambda <- 0.2
x <- rpois(10000,lambda)
pfit <- check_pois(x,0,max(x),mean(x))
print(pfit)
# 82% zeroes is not zero inflated -- expected according to Poisson!

# Example use if you have varying predictions, e.g. after a Poisson regression
n <- 10000
ru <- runif(n,0,10)
x <- rpois(n,lambda=ru)
check_pois(x, 0, 23, ru)

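# If you really want to do a statistical test of fit
# (uses pfit from the constant-mean example above)
chi_stat <- sum((pfit$Freq - pfit$PoisF)^2/pfit$PoisF)
# cells minus 1, minus 1 more because the mean was estimated from the data
df <- length(pfit$Freq) - 2
# upper-tail probability; dchisq() would return the density, not a p-value
stats::pchisq(chi_stat, df, lower.tail = FALSE) #p-value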
# I prefer evaluating specific integers though (e.g. zero-inflated, longer-tails, etc.)

[Package ptools version 2.0.0 Index]