R: Compute a mixture of a point mass at 0 and a log-concave...

kInflatedLogConDiscr {logcondiscr}

R Documentation

Compute a mixture of a point mass at 0 and a log-concave probability mass function from i.i.d. data

Description

Using an EM algorithm, compute an estimate of a mixture of a point mass at k and a log-concave probability mass function from discrete i.i.d. data.

Usage

kInflatedLogConDiscr(x, k = 0, prec1 = 1e-10, prec2 = 1e-15, 
    itermax = 200, output = TRUE, theta0 = 0.5, p0 = NA)

Arguments

`x`	Vector of observations.
`k`	Point at which inflation should be assumed. Must be in `x_1, x_1 + 1, \ldots, x_{n - 1}, x_n`.
`prec1`	Precision for stopping criterion.
`prec2`	Precision to remove ends of support in case weights `<prec2`.
`itermax`	Maximal number of iterations of EM algorithm.
`output`	Logical, if `TRUE`, progress of EM algorithm is shown.
`theta0`	Optional initialization for `\theta_0`, the point mass at `k`.
`p0`	Optional initialization for the PMF.

Details

Given a vector of observations {\bold{x}_n} = (x_1, \ldots, x_n) from a discrete PMF with a (potential) point mass at k (typically k = 0), kInflatedLogConDiscr computes a pmf that is a mixture between a point mass at k and a log-concave PMF on x. To accomplish this, an EM algorithm is used.

Value

A list containing the following elementes:

`z`	The support.
`f`	The estimated `k`-inflated log-concave PMF.
`E(L)`	The value of the expected composite log-likelihood at the maximum.
`loglik`	The value of the composite log-likelihood at the maximum.
`theta`	The estimated weight at `k`.
`logconc.pmf`	The log-concave part of the mixture.
`logconc.z`	The support of `logconc.pmf`.

Author(s)

Kaspar Rufibach (maintainer) kaspar.rufibach@gmail.com
http://www.kasparrufibach.ch
Fadoua Balabdaoui fadoua@ceremade.dauphine.fr
http://www.ceremade.dauphine.fr/~fadoua
Hanna Jankowski hkj@mathstat.yorku.ca
http://www.math.yorku.ca/~hkj
Kathrin Weyermann

References

Balabdaoui, F., Jankowski, H., Rufibach, K., and Pavlides, M. (2013). Maximum likelihood estimation and confidence bands for a discrete log-concave distribution. J. R. Stat. Soc. Ser. B Stat. Methodol., 75(4), 769–790.

Weyermann, K. (2007). An Active Set Algorithm for Log-Concave Discrete Distributions. MSc thesis, University of Bern (Supervisor: Lutz Duembgen).

Examples


## -----------------------------------------------
## generate zero-inflated negative binomial sample
## -----------------------------------------------
set.seed(2011)
n <- 100
theta <- 0.05
r <- 6
p <- 0.3
x <- rnbinom(n, r, p)

## inflate at 0
x <- ifelse(runif(n) <= theta, 0, x)

## estimate log-concave MLE
fit1 <- logConDiscrMLE(x, w = NA, psi_o = NA, prec = 1e-05, output = TRUE)

## estimate zero-inflated log-concave MLE
fit2 <- kInflatedLogConDiscr(x, k = 0)

## plot the results
par(mfrow = c(1, 1), las = 1)
plot(fit1$x, exp(fit1$psi), type = "b", col = 2, xlim = range(x), xlab = "x", 
    ylim = c(0, max(exp(fit1$psi), fit2$f)), ylab = "PMF", 
    main = "Estimate MLE from a zero-inflated negative-binomial", pch = 19)
lines(fit2$z, fit2$f, type = "b", col = 4, pch = 15)

## add the true PMF we sampled from
z <- fit2$z
f.true <- theta * c(1, rep(0, length(z) - 1)) + (1 - theta) * dnbinom(z, r, p)
lines(z, f.true, col = 6, type = "b", pch = 17)

legend("topright", c("log-concave MLE", "zero-inflated log-concave MLE", 
    "true PMF"), col = c(2, 4, 6), lty = c(1, 1, 1), pch = c(19, 15, 17), 
    bty = "n")
    
## Not run: 
## -----------------------------------------------
## generate seven-inflated negative binomial sample
## -----------------------------------------------
theta <- 0.05
r <- 4
p <- 0.3
n <- 10000
x <- rnbinom(n, r, p)
x <- ifelse(runif(n) <= theta, 7, x)
x <- c(x, rep(7, 10))

## compute different estimates
zero.mle <- kInflatedLogConDiscr(x, k = 7)
mle <- logConDiscrMLE(x, output = FALSE)
f.mle <- exp(mle$psiSupp)
z<-	zero.mle$z
f1 <- theta * c(rep(0, 7 - min(x)), 1, rep(0, max(x) - 7))
f2 <- (1 - theta) * dnbinom(z, r, p)
f.true <- f1 + f2
true <- dnbinom(z, r, p)
f.fit <- zero.mle$f
xx <- sort(unique(x))
emp <- rep(0, length(z))
emp[xx - min(x) + 1] <- as.vector(table(x) / n)

## plot results
plot(z, f.true, type = "l", ylim = c(0, max(emp)), xlab = "x", 
    ylab = "PMF", main = "Illustration k-inflated estimator")
points(z, true, type = "l", lty = 2)
points(z, f.fit, type = "l", col = "red")
points(z, zero.mle$logconc.pmf, type = "l", col = "red", lty = 2)
points(min(x):max(x), f.mle, type = "l", col = "green")
points(z, emp, type = "l", col = "purple")
points(z, emp, col = "purple")
legend("topright", inset = 0.05, c("true", "true less seven", "seven-inflated", 
    "recovered", "logconc", "empirical"), lty=c(1, 2, 1, 2, 1, 1), col = c("black", 
    "black", "red", "red", "green", "purple"))

## zoom in near mode
subs <- 4:12
plot(z[subs], f.true[subs], type = "l", ylim = c(0, max(emp)), xlab = "x", 
    ylab = "PMF", main = "Illustration k-inflated estimator")
points(z[subs], true[subs], type = "l", lty = 2)
points(z[subs], f.fit[subs], type = "l", col = "red")
points(z[subs], zero.mle$logconc.pmf[subs], type = "l", col = "red", lty = 2)
points((min(x):max(x))[subs], f.mle[subs], type = "l", col = "green")
points(z[subs], emp[subs], type = "l", col = "purple")
points(z[subs], emp[subs], col = "purple")

## End(Not run)

[Package logcondiscr version 1.0.6 Index]