IMSE {deepgp}                                                R Documentation

Integrated Mean-Squared (prediction) Error for Sequential Design

Description

Acts on a gp, dgp2, or dgp3 object. Current version requires squared exponential covariance (cov = "exp2"). Calculates IMSE over the input locations x_new. Optionally utilizes SNOW parallelization. User should select the point with the lowest IMSE to add to the design.

Usage

IMSE(object, x_new, cores)

## S3 method for class 'gp'
IMSE(object, x_new = NULL, cores = 1)

## S3 method for class 'dgp2'
IMSE(object, x_new = NULL, cores = 1)

## S3 method for class 'dgp3'
IMSE(object, x_new = NULL, cores = 1)

Arguments

object

object of class gp, dgp2, or dgp3

x_new

matrix of possible input locations; if object has been run through predict, the previously stored x_new is used

cores

number of cores to utilize in parallel; by default, no parallelization is used

Details

Not yet implemented for Vecchia-approximated fits.

All iterations in the object are used in the calculation, so samples should be burned-in. Thinning the samples using trim will speed up computation. This function may be used in two ways:

Option 1: called on an object with only MCMC iterations, in which case x_new must be specified

Option 2: called on an object that has been predicted over, in which case the x_new from predict is used

In Option 2, it is recommended to set store_latent = TRUE for dgp2 and dgp3 objects so latent mappings do not have to be recalculated. Through predict, the user may specify a mean mapping (mean_map = TRUE) or a full sample from the MVN distribution over w_new (mean_map = FALSE). When the object has not yet been predicted over (Option 1), the mean mapping is used.
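
For example, a two-layer fit intended for Option 2 might be predicted over as follows (a sketch; fit2 stands for a hypothetical dgp2 object with burned-in samples):

fit2 <- predict(fit2, x_new, store_latent = TRUE, mean_map = TRUE)
imse2 <- IMSE(fit2) # reuses x_new and the stored latent mappings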

SNOW parallelization reduces computation time but requires more memory storage.
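
For instance, to use a SNOW cluster (a sketch; the appropriate core count is machine-dependent):

imse <- IMSE(fit, x_new, cores = 4)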

Value

list with elements:

value: vector of IMSE values corresponding to the locations in x_new (the location with the lowest value is the recommended next design point)

References

Sauer, A., Gramacy, R.B., & Higdon, D. (2023). Active learning for deep Gaussian process surrogates. Technometrics, 65, 4-18. arXiv:2012.08015

Binois, M., Huang, J., Gramacy, R.B., & Ludkovski, M. (2019). Replication or exploration? Sequential design for stochastic simulation experiments. Technometrics, 61, 7-23. doi:10.1080/00401706.2018.1469433

Examples

# --------------------------------------------------------
# Example 1: toy step function, runs in less than 5 seconds
# --------------------------------------------------------

f <- function(x) {
    if (x <= 0.4) return(-1)
    if (x >= 0.6) return(1)
    return(10 * (x - 0.5)) # linear ramp on (0.4, 0.6)
}

x <- seq(0.05, 0.95, length = 7)
y <- sapply(x, f)
x_new <- seq(0, 1, length = 100)

# Fit model and calculate IMSE
fit <- fit_one_layer(x, y, nmcmc = 100, cov = "exp2")
fit <- trim(fit, 50)
fit <- predict(fit, x_new, cores = 1)
imse <- IMSE(fit)


# --------------------------------------------------------
# Example 2: Higdon function
# --------------------------------------------------------

f <- function(x) {
    i <- x <= 0.48 # logical index handles scalar and vector inputs
    x[i] <- 2 * sin(pi * x[i] * 4) + 0.4 * cos(pi * x[i] * 16)
    x[!i] <- 2 * x[!i] - 1
    return(x)
}

# Training data
x <- seq(0, 1, length = 30)
y <- f(x) + rnorm(30, 0, 0.05)

# Testing data
xx <- seq(0, 1, length = 100)
yy <- f(xx)

plot(xx, yy, type = "l")
points(x, y, col = 2)

# Conduct MCMC (can replace fit_three_layer with fit_one_layer/fit_two_layer)
fit <- fit_three_layer(x, y, D = 1, nmcmc = 2000, cov = "exp2")
plot(fit)
fit <- trim(fit, 1000, 2)

# Option 1 - calculate IMSE from only MCMC iterations
imse <- IMSE(fit, xx)

# Option 2 - calculate IMSE after predictions
fit <- predict(fit, xx, cores = 1, store_latent = TRUE)
imse <- IMSE(fit)

# Visualize fit
plot(fit)
par(new = TRUE) # overlay IMSE
plot(xx, imse$value, col = 2, type = "l", lty = 2,
     axes = FALSE, xlab = "", ylab = "")

# Select next design point
x_new <- xx[which.min(imse$value)]
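
# A sketch of one full acquisition step (f stands in for the expensive
# simulator): evaluate the chosen point, augment the design, refit,
# and recompute IMSE; repeat until the simulation budget is spent
x <- c(x, x_new)
y <- c(y, f(x_new) + rnorm(1, 0, 0.05))
fit <- fit_three_layer(x, y, D = 1, nmcmc = 2000, cov = "exp2")
fit <- trim(fit, 1000, 2)
imse <- IMSE(fit, xx)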



[Package deepgp version 1.1.1 Index]