bfs {hetGP} | R Documentation |
Bayes Factor Data
Description
Data from a Bayes factor MCMC-based simulation experiment comparing Student-t to Gaussian errors in an RJ-based Laplace prior Bayesian linear regession setting
Usage
data(ato)
Format
Calling data(bfs)
causes the following objects to be loaded into the namespace.
bfs.exp
20x11
data.frame
whose first column is\theta
, indicating the mean parameter of an exponential distribution encoding the prior of the Student-t degrees of freedom parameter\nu
. The remaining ten columns comprise of Bayes factor evaluations under that settingbfs.gamma
80x7
data.frame
whose first two columns are\beta
and\alpha
, indicating the second and first parameters to a Gamma distribution encoding the prior of the Student-t degrees of freedom parameters\nu
. The remaining five columns comprise of Bayes factor evaluations under those settings
Details
Gramacy & Pantaleo (2010), Sections 3-3-3.4, describe an experiment
involving Bayes factor (BF) calculations to determine if data are
leptokurtic (Student-t errors) or not (simply Gaussian) as a function of the
prior parameterization on the Student-t degrees of freedom parameter
\nu
. Franck & Gramacy (2018) created a grid of hyperparameter
values in \theta
describing the mean of an Exponential
distribution, evenly spaced in \log_{10}
space from
10^(-3)
to 10^6
spanning “solidly Student-t” (even
Cauchy) to “essentially Gaussian” in terms of the mean of the prior
over \nu
. For each \theta
setting on the grid they
ran the Reversible Jump (RJ) MCMC to approximate the BF of Student-t over Gaussian
by feeding in sample likelihood evaluations provided by monomvn's
blasso
to compute the BF. In order to understand the
Monte Carlo variability in those calculations, ten replicates of the BFs
under each hyperparameter setting were collected. These data are provided
in bfs.exp
.
A similar, larger experiment was provided with \nu
under a Gamma
prior with parameters \alpha
and \beta \equiv \theta
. In this higher dimensional space, a Latin hypercube sample
of size eighty was created, and five replicates of BFs were recorded.
These data are provided in bfs.gamma
.
The examples below involve mleHetTP
fits (Chung, et al., 2018)
to these data and a visualization of the predictive surfaces thus obtained.
The code here follows an example provided, with more detail, in
vignette("hetGP")
Note
For code showing how these BFs were calculated, see supplementary material from Franck & Gramacy (2018)
Author(s)
Mickael Binois, mbinois@mcs.anl.gov, and Robert B. Gramacy, rbg@vt.edu
References
Franck CT, Gramacy RB (2018). Assessing Bayes factor surfaces using interactive visualization and computer surrogate modeling. Preprint available on arXiv:1809.05580.
Gramacy RB (2017). monomvn: Estimation for Multivariate Normal and Student-t Data with Monotone Missingness. R package version 1.9-7, https://CRAN.R-project.org/package=monomvn.
R.B. Gramacy and E. Pantaleo (2010). Shrinkage regression for multivariate inference with missing data, and an application to portfolio balancing. Bayesian Analysis 5(2), 237-262. Preprint available on arXiv:0907.2135
Chung M, Binois M, Gramacy RB, Moquin DJ, Smith AP, Smith AM (2018). Parameter and Uncertainty Estimation for Dynamical Systems Using Surrogate Stochastic Processes. SIAM Journal on Scientific Computing, 41(4), 2212-2238. Preprint available on arXiv:1802.00852.
See Also
ato
, sirEval
, mleHetTP
,
vignette("hetGP")
, blasso
Examples
data(bfs)
##
## Exponential version first
##
thetas <- matrix(bfs.exp$theta, ncol=1)
bfs <- as.matrix(t(bfs.exp[,-1]))
## the data are heavy tailed, so t-errors help
bfs1 <- mleHetTP(X=list(X0=log10(thetas), Z0=colMeans(log(bfs)),
mult=rep(nrow(bfs), ncol(bfs))), Z=log(as.numeric(bfs)), lower=10^(-4),
upper=5, covtype="Matern5_2")
## predictions on a grid in 1d
dx <- seq(0,1,length=100)
dx <- 10^(dx*4 - 3)
p <- predict(bfs1, matrix(log10(dx), ncol=1))
## visualization
matplot(log10(thetas), t(log(bfs)), col=1, pch=21, ylab="log(bf)",
main="Bayes factor surface")
lines(log10(dx), p$mean, lwd=2, col=2)
lines(log10(dx), p$mean + 2*sqrt(p$sd2 + p$nugs), col=2, lty=2, lwd=2)
lines(log10(dx), p$mean - 2*sqrt(p$sd2 + p$nugs), col=2, lty=2, lwd=2)
legend("topleft", c("hetTP mean", "hetTP interval"), lwd=2, lty=1:2, col=2)
##
## Now Gamma version
##
D <- as.matrix(bfs.gamma[,1:2])
bfs <- as.matrix(t(bfs.gamma[,-(1:2)]))
## fitting in 2fd
bfs2 <- mleHetTP(X=list(X0=log10(D), Z0=colMeans(log(bfs)),
mult=rep(nrow(bfs), ncol(bfs))), Z = log(as.numeric(bfs)),
lower = rep(10^(-4), 2), upper = rep(5, 2), covtype = "Matern5_2",
maxit=100000)
## predictions on a grid in 2d
dx <- seq(0,1,length=100)
dx <- 10^(dx*4 - 3)
DD <- as.matrix(expand.grid(dx, dx))
p <- predict(bfs2, log10(DD))
## visualization via image-contour plots
par(mfrow=c(1,2))
mbfs <- colMeans(bfs)
image(log10(dx), log10(dx), t(matrix(p$mean, ncol=length(dx))),
col=heat.colors(128), xlab="log10 alpha", ylab="log10 beta",
main="mean log BF")
text(log10(D[,2]), log10(D[,1]), signif(log(mbfs), 2), cex=0.5)
contour(log10(dx), log10(dx),t(matrix(p$mean, ncol=length(dx))),
levels=c(-5,-3,-1,0,1,3,5), add=TRUE, col=4)
image(log10(dx), log10(dx), t(matrix(sqrt(p$sd2 + p$nugs),
ncol=length(dx))), col=heat.colors(128), xlab="log10 alpha",
ylab="log10 beta", main="sd log BF")
text(log10(D[,2]), log10(D[,1]), signif(apply(log(bfs), 2, sd), 2),
cex=0.5)