AR.Sim {AR}R Documentation

Graphical Visualization for Accept-Reject Method

Description

Package AR provides a graphical presentation for Accept-Reject method by drawing three figures which their explanations are as follow:

Explanation of Figure 1:

Moreover, even when the Rejection Accept-Reject method is applied, it is always hard to optimize the constant cc for the likelihood ratio. Although, the algorithm works with a bigger constant cc (with respect to optimal/minimum possible cc), but increasing cc cause high rejection rate and the algorithm can be very in-efficient. The first figure show three curves fX(x)f_X(x), fY(x)f_Y(x) and fX(x)fY(x)\frac{f_X(x)}{f_Y(x)}. Moreover, the optimum cc (minimum possible cc, such that fX(x)fY(x)c\frac{f_X(x)}{f_Y(x)} \leq c) calculated as the maximum height of the curve fX(x)fY(x)\frac{f_X(x)}{f_Y(x)}, which is also shown on the first figure.

Explanation of Figure 2:

To visualize the motivation behind the Acceptance-Rejection method, imagine graphing curve fX(y)c fY(y)\frac{f_X(y)}{c \ f_Y(y)} onto a large rectangular board and throwing darts at it. Assume that the xx-positions of these darts/points are uniformly distributed around the board and the distribution of yy-positions of them are based on YY distribution. Now, remove all of the darts/points that are outside the area under the curve fX(y)c fY(y)\frac{f_X(y)}{c \ f_Y(y)}. The xx-positions of the remaining darts will be distributed according to the random variable's density of XX within the area under the curve. Since, it can be prove that

P[Yy  UfX(Y)c fY(Y)]=P(Xx). P\left[ Y\leq y \ | \ U \leq \frac{f_X(Y)}{c \ f_Y(Y)} \right] = P\left( X \leq x \right) .

Explanation of Figure 3:

For another graphical presentation of the motivation behind the Acceptance-Rejection method, assumes that the considered board (which is presented in explanation of Figure 2) is not necessarily rectangular but is shaped according to some distribution that we know how to generate sample from it (c.fY(y)c . f_Y(y)). Therefore, if yy-positions of random points/darts be equal to u.c.fY(y)u.c.f_Y(y), then all darts/points will be land under the curve c.fY(y)c.f_Y(y). The acceptance condition in the Acceptance-Rejection method is

ufX(y)c fY(y),u \leq \frac{f_X(y)}{c \ f_Y(y)},

or equivalently

u.c.fY(y)fX(y),u.c.f_Y(y) \leq f_X(y),

and it means that after omitting the extra/red random darts/points from the board (which are not satisfy in the acceptance condition), the xx-positions of the remaining darts/points will be distributed according to the distribution of XX.

Usage

AR.Sim(n, f_X, Y.dist, Y.dist.par, xlim = c(0, 1), S_X = xlim, Rej.Num = TRUE,
       Rej.Rate = TRUE, Acc.Rate = TRUE)

Arguments

n

The number/length of data which must be generated/simulated from fXf_X density.

f_X

The density fXf_X of interest for simulation (called the target density)

Y.dist

The distribution name of the random variable YY, which used to generate the random data from fYf_Y. Precisely, Y.dist is the name of fYf_Y density which is match with DISTRIB Package. For example, use Y.dist = "norm", when YN(μ,σ2) Y \sim N(\mu, \sigma^2) .

Y.dist.par

A vector of YY distribution parameters with considered ordering in stats package and also is match with DISTRIB Package. For example, use Y.dist.par = c(2,3), when YN(μ=2,σ2=9) Y \sim N(\mu=2, \sigma^2=9) .

xlim

NULL or a numeric vector of length 2; if non-NULL it provides the defaults for c(from, to) and, unless add=TRUE, selects the xx-limits of the available plot. Its default is xlim=c(0,1).

S_X

The support of XX with default S_X = xlim , which is needed for calculating the optimum value of constant cc.

Rej.Num

A logical argument with default TRUE for calculate the number of rejections in Accept-Reject method. If Rej.Num = FALSE, then the number of rejections is not reported.

Rej.Rate

A logical argument with default TRUE for calculate the ratio of rejections in Accept-Reject method (i.e. Rej.Num / n). If Rej.Rate = FALSE, then the ratio of rejections is not reported.

Acc.Rate

A logical argument with default TRUE for calculate the ratio of acceptances in Accept-Reject method (i.e. 1 - Rej.Rate ). If Acc.Rate = FALSE, then the ratio of acceptances is not reported.

Value

A vector of generated/simulated data from random variable XX with length nn.

Optimum value for cc, i.e. the minimum possible value for cc.

References

Robert, C.P., Casella, G., Introducing Monte Carlo Methods with R, New York: Springer (2010).

Wikipedia, the free encyclopedia, Rejection sampling, https://en.wikipedia.org/wiki/Rejection_sampling

Examples

# Example 1: 
data = AR.Sim( n = 150, 
         f_X = function(y){dbeta(y,2.7,6.3)},
         Y.dist = "unif", Y.dist.par = c(0,1),
         Rej.Num = TRUE,
         Rej.Rate = TRUE,
         Acc.Rate = FALSE
         ) 

# QQ-plot
q <- qbeta(ppoints(100), 2.7, 6.3)
qqplot(q, data, cex=0.6, xlab="Quantiles of Beta(2.7,6.3)", 
       ylab="Empirical Quantiles of simulated data")
abline(0, 1, col=2)


# ------------------------------------------------------
# Example 2: From Page 54 of (Robert and Casella, 2009)
f_X = function(x) dbeta(x,2.7,6.3)
Simulation1 <- AR.Sim(n=300, f_X, Y.dist = "unif", Y.dist.par = c(0,1)) 
Simulation2 <- AR.Sim(n=2000, f_X, Y.dist="beta", Y.dist.par=c(2,6) ) 
Simulation3 <- AR.Sim(n=1000, f_X, Y.dist="beta", Y.dist.par=c(1.5,3.7) ) 
Simulation4 <- AR.Sim(n=250, f_X, Y.dist="norm", Y.dist.par=c(.5,.2) ) 
Simulation5 <- AR.Sim(n=200,  f_X, Y.dist="exp", Y.dist.par=3 ) 
Simulation6 <- AR.Sim( 400 ,  f_X, Y.dist="gamma", Y.dist.par=c(2,5) ) 

hist(Simulation1, prob=TRUE)#, col="gray20")
hist(Simulation2, prob=TRUE, add=TRUE, col="gray35")
hist(Simulation3, prob=TRUE, add=TRUE, col="gray60")
hist(Simulation4, prob=TRUE, add=TRUE, col="gray75")
hist(Simulation5, prob=TRUE, add=TRUE, col="gray85")
hist(Simulation6, prob=TRUE, add=TRUE, col="gray100")
curve(f_X(x), add=TRUE, col=2, lty=2, lwd=3)

#compare empirical and theoretical percentiles:
p <- seq(.1, .9, .1)
Qhat1 <- quantile(Simulation1, p) #Empirical quantiles of simulated sample 
Qhat2 <- quantile(Simulation2, p) #Empirical quantiles of simulated sample 
Qhat3 <- quantile(Simulation3, p) #Empirical quantiles of simulated sample 
Qhat4 <- quantile(Simulation4, p) #Empirical quantiles of simulated sample 
Qhat5 <- quantile(Simulation5, p) #Empirical quantiles of simulated sample 
Qhat6 <- quantile(Simulation6, p) #Empirical quantiles of simulated sample 
Q <- qbeta(p, 2.7, 6.3)    #Theoretical quantiles of Be(2.7,6.3)
round( rbind(Q, Qhat1, Qhat2, Qhat3, Qhat4, Qhat5, Qhat6), 3)

# Compute p-value of Kolmogorov-Smirnov test:
ks.test(Simulation1, "pbeta", 2.7, 6.3)$p.value
ks.test(Simulation2, "pbeta", 2.7, 6.3)$p.value
ks.test(Simulation3, "pbeta", 2.7, 6.3)$p.value
ks.test(Simulation4, "pbeta", 2.7, 6.3)$p.value
ks.test(Simulation5, "pbeta", 2.7, 6.3)$p.value
ks.test(Simulation6, "pbeta", 2.7, 6.3)$p.value


# ------------------------------------------------------
# Example 3: Simulate Truncated N(5,2^2) at l=3 and r=14 in left and rigth sides, respectively. 

mu = 5
sigma = 2
l = 3
r = 14
n = 400
f_X = function(x)  dnorm(x,mu,sigma) * 
                   as.integer(l<x & x<r) / (pnorm(r,mu,sigma)-pnorm(l,mu,sigma)) 

Sim1 <- AR.Sim(n, f_X, S_X=c(l,r), Y.dist="norm", Y.dist.par=c(5,4), xlim=c(l-1,r+1) ) 
head(Sim1, 15)
hist(Sim1, prob=TRUE, col="lightgreen")
curve(f_X(x), add=TRUE, col=2, lty=2, lwd=3) # Truncated pdf of N(5,2^2) at l and r
c2 = 1 / (pnorm(r,mu,sigma)-pnorm(l,mu,sigma)) ; c2 #See page 15 jozve


[Package AR version 1.1 Index]