rejectionSampling {SimDesign} | R Documentation |
Rejection sampling (i.e., accept-reject method)
Description
This function supports the rejection sampling (i.e., accept-reject) approach to drawing values from seemingly difficult (probability) density functions by sampling values from more manageable proxy distributions.
Usage
rejectionSampling(
n,
df,
dg,
rg,
M,
method = "optimize",
interval = NULL,
logfuns = FALSE,
maxM = 1e+05,
parstart = rg(1L),
ESRS_Mstart = 1.0001
)
Arguments
n |
number of samples to draw |
df |
the desired (potentially un-normed) density function to draw
independent samples from. Must be in the form of a |
dg |
the proxy (potentially un-normed) density function to
draw samples from in lieu of drawing samples from |
rg |
the proxy random number generation function, associated with
|
M |
the upper-bound of the ratio of probability density functions to help
minimize the number of discarded draws and define the corresponding
rescaled proposal envelope. When missing, |
method |
when M is missing, the optimization of M is done either by
finding the mode of the log-density values ( |
interval |
when M is missing, for univariate density function draws,
the interval to search within via |
logfuns |
logical; have the |
maxM |
logical; if when optimizing M the value is greater than this cut-off then stop; ampler would likelihood be too efficient, or optimization is failing |
parstart |
starting value vector for optimization of M in multidimensional distributions |
ESRS_Mstart |
starting M value for the ESRS algorithm |
Details
The accept-reject algorithm is a flexible approach to obtaining i.i.d.'s from
a difficult to sample from (probability) density function where either the
transformation method fails or inverse transform method is
difficult to manage. The algorithm does so by sampling from
a more "well-behaved" proxy distribution (with identical support, up to some
proportionality constant M
that reshapes the proposal density
to envelope the target density), and accepts the
draws if they are likely within the target density. Hence, the closer the
shape of dg(x)
is to the desired df(x)
, the more likely the draws
are to be accepted; otherwise, many iterations of the accept-reject algorithm
may be required, which decreases the computational efficiency.
Value
returns a vector or matrix of draws (corresponding to the
output class from rg
) from the desired df
Author(s)
Phil Chalmers rphilip.chalmers@gmail.com
References
Caffo, B. S., Booth, J. G., and Davison, A. C. (2002). Empirical supremum
rejection sampling. Biometrika
, 89, 745–754.
Chalmers, R. P., & Adkins, M. C. (2020). Writing Effective and Reliable
Monte Carlo Simulations with the SimDesign Package.
The Quantitative Methods for Psychology, 16
(4), 248-280.
doi:10.20982/tqmp.16.4.p248
Sigal, M. J., & Chalmers, R. P. (2016). Play it again: Teaching statistics
with Monte Carlo simulation. Journal of Statistics Education, 24
(3),
136-156. doi:10.1080/10691898.2016.1246953
Examples
## Not run:
# Generate X ~ beta(a,b), where a and b are a = 2.7 and b = 6.3,
# and the support is Y ~ Unif(0,1)
dfn <- function(x) dbeta(x, shape1 = 2.7, shape2 = 6.3)
dgn <- function(x) dunif(x, min = 0, max = 1)
rgn <- function(n) runif(n, min = 0, max = 1)
# when df and dg both integrate to 1, acceptance probability = 1/M
M <- rejectionSampling(df=dfn, dg=dgn, rg=rgn)
M
dat <- rejectionSampling(10000, df=dfn, dg=dgn, rg=rgn, M=M)
hist(dat, 100)
hist(rbeta(10000, 2.7, 6.3), 100) # compare
# obtain empirical estimate of M via ESRS method
M <- rejectionSampling(1000, df=dfn, dg=dgn, rg=rgn, method='ESRS')
M
# generate using better support function (here, Y ~ beta(2,6)),
# and use log setup in initial calls (more numerically accurate)
dfn <- function(x) dbeta(x, shape1 = 2.7, shape2 = 6.3, log = TRUE)
dgn <- function(x) dbeta(x, shape1 = 2, shape2 = 6, log = TRUE)
rgn <- function(n) rbeta(n, shape1 = 2, shape2 = 6)
M <- rejectionSampling(df=dfn, dg=dgn, rg=rgn, logfuns=TRUE) # better M
M
## Alternative estimation of M
## M <- rejectionSampling(10000, df=dfn, dg=dgn, rg=rgn, logfuns=TRUE,
## method='ESRS')
dat <- rejectionSampling(10000, df=dfn, dg=dgn, rg=rgn, M=M, logfuns=TRUE)
hist(dat, 100)
#------------------------------------------------------
# sample from wonky (and non-normalized) density function, like below
dfn <- function(x){
ret <- numeric(length(x))
ret[x <= .5] <- dnorm(x[x <= .5])
ret[x > .5] <- dnorm(x[x > .5]) + dchisq(x[x > .5], df = 2)
ret
}
y <- seq(-5,5, length.out = 1000)
plot(y, dfn(y), type = 'l', main = "Function to sample from")
# choose dg/rg functions that have support within the range [-inf, inf]
rgn <- function(n) rnorm(n, sd=4)
dgn <- function(x) dnorm(x, sd=4)
## example M height from above graphic
## (M selected using ESRS to help stochastically avoid local mins)
M <- rejectionSampling(10000, df=dfn, dg=dgn, rg=rgn, method='ESRS')
M
lines(y, dgn(y)*M, lty = 2)
dat <- rejectionSampling(10000, df=dfn, dg=dgn, rg=rgn, M=M)
hist(dat, 100, prob=TRUE)
# true density (normalized)
C <- integrate(dfn, -Inf, Inf)$value
ndfn <- function(x) dfn(x) / C
curve(ndfn, col='red', lwd=2, add=TRUE)
#-----------------------------------------------------
# multivariate distribution
dfn <- function(x) sum(log(c(dnorm(x[1]) + dchisq(x[1], df = 5),
dnorm(x[2], -1, 2))))
rgn <- function(n) c(rnorm(n, sd=3), rnorm(n, sd=3))
dgn <- function(x) sum(log(c(dnorm(x[1], sd=3), dnorm(x[1], sd=3))))
# M <- rejectionSampling(df=dfn, dg=dgn, rg=rgn, logfuns=TRUE)
dat <- rejectionSampling(5000, df=dfn, dg=dgn, rg=rgn, M=4.6, logfuns=TRUE)
hist(dat[,1], 30)
hist(dat[,2], 30)
plot(dat)
## End(Not run)