R: Compute normal mixtures

normalmixture {bayesmeta}

R Documentation

Compute normal mixtures

Description

This function allows to derive density, distribution function and quantile function of a normal mixture with fixed mean (\mu) and random standard deviation (\sigma).

Usage

  normalmixture(density,
                cdf = Vectorize(function(x){integrate(density,0,x)$value}),
                mu = 0, delta = 0.01, epsilon = 0.0001,
                rel.tol.integrate = 2^16*.Machine$double.eps,
                abs.tol.integrate = rel.tol.integrate,
                tol.uniroot = rel.tol.integrate)

Arguments

`density`	the `\sigma` mixing distribution's probability density `function`.
`cdf`	the `\sigma` mixing distribution's cumulative distribution `function`.
`mu`	the normal mean (`\mu`).
`delta`, `epsilon`	the parameters specifying the desired accuracy for approximation of the mixing distribution, and with that determining the number of `\sigma` support points being used internally. Smaller values imply greater accuracy and greater computational burden (Roever and Friede, 2017).
`rel.tol.integrate`, `abs.tol.integrate`, `tol.uniroot`	the `rel.tol`, `abs.tol` and `tol` ‘accuracy’ arguments that are passed to the `integrate()` or `uniroot()` functions for internal numerical integration or root finding (see also the help there).

Details

When a normal random variable

X|\mu,\sigma \;\sim\; \mathrm{Normal}(\mu,\sigma^2)

has a fixed mean \mu, but a random standard deviation

\sigma|\phi \;\sim\; \mathrm{G}(\phi)

following some probability distribution \mathrm{G}(\phi), then the marginal distribution of X,

X|\mu,\phi

is a mixture distribution (Lindsay, 1995; Seidel, 2010).

The mixture distribution's probability density function etc. result from integration and often are not available in analytical form. The normalmixture() function approximates density, distribution function and quantile function to some pre-set accuracy by a discrete mixture of normal distributions based on a finite number of \sigma values using the ‘DIRECT’ algorithm (Roever and Friede, 2017).

Either the “density” or “cdf” argument needs to be supplied. If only “density” is given, then the CDF is derived via integration, if only “cdf” is supplied, then the density function is not necessary.

In meta-analysis applications, mixture distributions arise e.g. in the context of prior predictive distributions. Assuming that a study-specific effect \theta_i a priori is distributed as

\theta_i|\mu,\tau \;\sim\; \mathrm{Normal}(\mu,\tau^2)

with a prior distribution for the heterogeneity \tau,

\tau|\phi \;\sim\; \mathrm{G}(\phi)

yields a setup completely analogous to the above one.

Since it is sometimes hard to judge what constitutes a sensible heterogeneity prior, it is often useful to inspect the implications of certain settings in terms of the corresponding prior predictive distribution of

\theta_i|\mu,\phi

indicating the a priori implied variation between studies due to heterogeneity alone based on a certain prior setup (Spiegelhalter et al., 2004, Sec. 5.7.3). Some examples using different heterogeneity priors are illustrated below.

Value

A list containing the following elements:

`delta`, `epsilon`	The supplied design parameters.
`mu`	the normal mean.
`bins`	the number of bins used.
`support`	the matrix containing lower, upper and reference points for each bin and its associated probability.
`density`	the mixture's density `function(x)`.
`cdf`	the mixture's cumulative distribution `function(x)` (CDF).
`quantile`	the mixture's quantile `function(p)` (inverse CDF).
`mixing.density`	the mixing distribution's density `function()` (if supplied).
`mixing.cdf`	the mixing distribution's cumulative distribution `function()`.

Author(s)

Christian Roever christian.roever@med.uni-goettingen.de

References

B.G. Lindsay. Mixture models: theory, geometry and applications. Vol. 5 of CBMS Regional Conference Series in Probability and Statistics, Institute of Mathematical Statistics, Hayward, CA, USA, 1995.

C. Roever, T. Friede. Discrete approximation of a mixture distribution via restricted divergence. Journal of Computational and Graphical Statistics, 26(1):217-222, 2017. doi:10.1080/10618600.2016.1276840.

C. Roever. Bayesian random-effects meta-analysis using the bayesmeta R package. Journal of Statistical Software, 93(6):1-51, 2020. doi:10.18637/jss.v093.i06.

C. Roever, R. Bender, S. Dias, C.H. Schmid, H. Schmidli, S. Sturtz, S. Weber, T. Friede. On weakly informative prior distributions for the heterogeneity parameter in Bayesian random-effects meta-analysis. Research Synthesis Methods, 12(4):448-474, 2021. doi:10.1002/jrsm.1475.

W.E. Seidel. Mixture models. In: M. Lovric (ed.), International Encyclopedia of Statistical Science, Springer, Heidelberg, pp. 827-829, 2010.

D.J. Spiegelhalter, K.R. Abrams, J.P.Myles. Bayesian approaches to clinical trials and health-care evaluation. Wiley & Sons, 2004.

Examples

##################################################################
# compare half-normal mixing distributions with different scales:
nm05 <- normalmixture(cdf=function(x){phalfnormal(x, scale=0.5)})
nm10 <- normalmixture(cdf=function(x){phalfnormal(x, scale=1.0)})
# (this corresponds to the case of assuming a half-normal prior
# for the heterogeneity tau)

# check the structure of the returned object:
str(nm05)

# show density functions:
# (these would be the marginal (prior predictive) distributions
# of study-specific effects theta[i])
x <- seq(-1, 3, by=0.01)
plot(x, nm05$density(x), type="l", col="blue", ylab="density")
lines(x, nm10$density(x), col="red")
abline(h=0, v=0, col="grey")

# show cumulative distributions:
plot(x, nm05$cdf(x), type="l", col="blue", ylab="CDF")
lines(x, nm10$cdf(x), col="red")
abline(h=0:1, v=0, col="grey")

# determine 5 percent and 95 percent quantiles:
rbind("HN(0.5)"=nm05$quantile(c(0.05,0.95)),
      "HN(1.0)"=nm10$quantile(c(0.05,0.95)))


##################################################################
# compare different mixing distributions
# (half-normal, half-Cauchy, exponential and Lomax):
nmHN <- normalmixture(cdf=function(x){phalfnormal(x, scale=0.5)})
nmHC <- normalmixture(cdf=function(x){phalfcauchy(x, scale=0.5)})
nmE  <- normalmixture(cdf=function(x){pexp(x, rate=2)})
nmL  <- normalmixture(cdf=function(x){plomax(x, shape=4, scale=2)})

# show densities (logarithmic y-axis):
x <- seq(-1, 3, by=0.01)
plot(x,  nmHN$density(x), col="green",  type="l", ylab="density", ylim=c(0.005, 6.5), log="y")
lines(x, nmHC$density(x), col="red")
lines(x, nmE$density(x),  col="blue")
lines(x, nmL$density(x),  col="cyan")
abline(v=0, col="grey")

# show CDFs:
plot(x,  nmHN$cdf(x), col="green",  type="l", ylab="CDF", ylim=c(0,1))
lines(x, nmHC$cdf(x), col="red")
lines(x, nmE$cdf(x),  col="blue")
lines(x, nmL$cdf(x),  col="cyan")
abline(h=0:1, v=0, col="grey")
# add "exponential" x-axis at top:
axis(3, at=log(c(0.5,1,2,5,10,20)), lab=c(0.5,1,2,5,10,20))
# show 95 percent quantiles:
abline(h=0.95, col="grey", lty="dashed")
abline(v=nmHN$quantile(0.95), col="green", lty="dashed")
abline(v=nmHC$quantile(0.95), col="red", lty="dashed")
abline(v=nmE$quantile(0.95),  col="blue", lty="dashed")
abline(v=nmL$quantile(0.95),  col="cyan", lty="dashed")
rbind("half-normal(0.5)"=nmHN$quantile(0.95),
      "half-Cauchy(0.5)"=nmHC$quantile(0.95),
      "exponential(2.0)"=nmE$quantile(0.95),
      "Lomax(4,2)"      =nmL$quantile(0.95))


#####################################################################
# a normal mixture distribution example where the solution
# is actually known analytically: the Student-t distribution.
# If  Y|sigma ~ N(0,sigma^2),  where  sigma = sqrt(k/X)
# and  X|k ~ Chi^2(df=k),
# then the marginal  Y|k  is Student-t with k degrees of freedom.

# define CDF of sigma:
CDF <- function(sigma, df){pchisq(df/sigma^2, df=df, lower.tail=FALSE)}

# numerically approximate normal mixture (with k=5 d.f.):
k <- 5
nmT1 <- normalmixture(cdf=function(x){CDF(x, df=k)})
# in addition also try a more accurate approximation:
nmT2 <- normalmixture(cdf=function(x){CDF(x, df=k)}, delta=0.001, epsilon=0.00001)
# check: how many grid points were required?
nmT1$bins
nmT2$bins

# show true and approximate densities:
x <- seq(-2,10,le=400)
plot(x, dt(x, df=k), type="l")
abline(h=0, v=0, col="grey")
lines(x, nmT1$density(x), col="red", lty="dashed")
lines(x, nmT2$density(x), col="blue", lty="dotted")

# show ratios of true and approximate densities:
plot(x, nmT1$density(x)/dt(x, df=k), col="red",
     type="l", log="y", ylab="density ratio")
abline(h=1, v=0, col="grey")
lines(x, nmT2$density(x)/dt(x, df=k), col="blue")

[Package bayesmeta version 3.4 Index]