fit_two_layer {deepgp}  R Documentation 
Conducts MCMC sampling of hyperparameters and hidden layer
w
for a two layer deep GP. Separate length scale
parameters theta_w
and theta_y
govern the correlation
strength of the hidden layer and outer layer respectively. Nugget
parameter g
governs noise on the outer layer. In Matern
covariance, v
governs smoothness.
fit_two_layer(
x,
y,
D = ifelse(is.matrix(x), ncol(x), 1),
nmcmc = 10000,
verb = TRUE,
w_0 = NULL,
g_0 = 0.01,
theta_y_0 = 0.1,
theta_w_0 = 0.1,
true_g = NULL,
settings = NULL,
cov = c("matern", "exp2"),
v = 2.5,
vecchia = FALSE,
m = min(25, length(y)  1)
)
x 
vector or matrix of input locations 
y 
vector of response values 
D 
integer designating dimension of hidden layer, defaults to
dimension of 
nmcmc 
number of MCMC iterations 
verb 
logical indicating whether to print iteration progress 
w_0 
initial value for hidden layer 
g_0 
initial value for 
theta_y_0 
initial value for 
theta_w_0 
initial value for 
true_g 
if true nugget is known it may be specified here (set to a small value to make fit deterministic). Note  values that are too small may cause numerical issues in matrix inversions. 
settings 
hyperparameters for proposals and priors (see details) 
cov 
covariance kernel, either Matern or squared exponential
( 
v 
Matern smoothness parameter (only used if 
vecchia 
logical indicating whether to use Vecchia approximation 
m 
size of Vecchia conditioning sets (only used if

Maps inputs x
through hidden layer w
to outputs
y
. Conducts sampling of the hidden layer using Elliptical
Slice sampling. Utilizes Metropolis Hastings sampling of the length
scale and nugget parameters with proposals and priors controlled by
settings
. When true_g
is set to a specific value, the
nugget is not estimated. When vecchia = TRUE
, all calculations
leverage the Vecchia approximation with specified conditioning set size
m
. Vecchia approximation is only implemented for
cov = "matern"
.
Proposals for g
, theta_y
, and
theta_w
follow a uniform sliding window scheme, e.g.
g_star < runif(1, l * g_t / u, u * g_t / l)
,
with defaults l = 1
and u = 2
provided in settings
.
To adjust these, set settings = list(l = new_l, u = new_u)
.
Priors on g
, theta_y
, and theta_w
follow Gamma
distributions with shape parameters (alpha
) and rate parameters
(beta
) controlled within the settings
list object.
Defaults are
settings$alpha$g < 1.5
settings$beta$g < 3.9
settings$alpha$theta_w < 1.5
settings$beta$theta_w < 3.9 / 4
settings$alpha$theta_y < 1.5
settings$beta$theta_y < 3.9 / 6
These priors are designed for x
scaled to
[0, 1] and y
scaled to have mean 0 and variance 1. These may be
adjusted using the settings
input.
When w_0 = NULL
, the hidden layer is initialized at x
(i.e. the identity mapping). The default prior mean of the hidden layer
is zero, but may be adjusted to x
using
settings = list(w_prior_mean = x)
. If w_0
is of dimension
nrow(x)  1
by D
, the final row is predicted using kriging.
This is helpful in sequential design when adding a new input location
and starting the MCMC at the place where the previous MCMC left off.
The output object of class dgp2
or dgp2vec
is designed for
use with continue
, trim
, and predict
.
a list of the S3 class dgp2
or dgp2vec
with elements:
x
: copy of input matrix
y
: copy of response vector
nmcmc
: number of MCMC iterations
settings
: copy of proposal/prior settings
v
: copy of Matern smoothness parameter (v = 999
indicates cov = "exp2"
)
g
: vector of MCMC samples for g
theta_y
: vector of MCMC samples for theta_y
(length
scale of outer layer)
theta_w
: matrix of MCMC samples for theta_w
(length
scale of inner layer)
tau2
: vector of MLE estimates for tau2
(scale
parameter of outer layer)
w
: list of MCMC samples for hidden layer w
time
: computation time in seconds
Sauer, A, RB Gramacy, and D Higdon. 2020. "Active Learning for Deep Gaussian
Process Surrogates." Technometrics, to appear; arXiv:2012.08015.
Sauer, A, A Cooper, and RB Gramacy. 2022. "Vecchiaapproximated Deep Gaussian
Processes for Computer Experiments." preprint on arXiv:2204.02904
Murray, I, RP Adams, and D MacKay. 2010. "Elliptical slice sampling."
Journal of Machine Learning Research 9, 541548.
# Examples of realworld implementations are available at:
# https://bitbucket.org/gramacylab/deepgpex/
# G function (https://www.sfu.ca/~ssurjano/gfunc.html)
f < function(xx, a = (c(1:length(xx))  1) / 2) {
new1 < abs(4 * xx  2) + a
new2 < 1 + a
prod < prod(new1 / new2)
return((prod  1) / 0.86)
}
# Training data
d < 1
n < 20
x < matrix(runif(n * d), ncol = d)
y < apply(x, 1, f)
# Testing data
n_test < 100
xx < matrix(runif(n_test * d), ncol = d)
yy < apply(xx, 1, f)
plot(xx[order(xx)], yy[order(xx)], type = "l")
points(x, y, col = 2)
# Example 1: full model (nugget estimated, using continue)
fit < fit_two_layer(x, y, nmcmc = 1000)
plot(fit)
fit < continue(fit, 1000)
plot(fit)
fit < trim(fit, 1000, 2)
fit < predict(fit, xx, cores = 1)
plot(fit, hidden = TRUE)
# Example 2: Vecchia approximated model
# (Vecchia approximation is faster for larger data sizes)
fit < fit_two_layer(x, y, nmcmc = 2000, vecchia = TRUE, m = 10)
plot(fit)
fit < trim(fit, 1000, 2)
fit < predict(fit, xx, cores = 1)
plot(fit, hidden = TRUE)
# Example 3: Vecchia approximated model (reapproximated after burnin)
fit < fit_two_layer(x, y, nmcmc = 1000, vecchia = TRUE, m = 10)
fit < continue(fit, 1000, re_approx = TRUE)
plot(fit)
fit < trim(fit, 1000, 2)
fit < predict(fit, xx, cores = 1)
plot(fit, hidden = TRUE)