laGP {laGP} | R Documentation |
Localized Approximate GP Prediction At a Single Input Location
Description
Build a sub-design of X
of size end
, and infer parameters,
for approximate Gaussian process prediction at reference location(s)
Xref
. Return the moments of those predictive equations, and indices
into the local design
Usage
laGP(Xref, start, end, X, Z, d = NULL, g = 1/10000,
method = c("alc", "alcopt", "alcray", "mspe", "nn", "fish"), Xi.ret = TRUE,
close = min((1000+end)*if(method[1] %in% c("alcray", "alcopt")) 10 else 1, nrow(X)),
alc.gpu = FALSE, numstart = if(method[1] == "alcray") ncol(X) else 1,
rect = NULL, lite = TRUE, verb = 0)
laGP.R(Xref, start, end, X, Z, d = NULL, g = 1/10000,
method = c("alc", "alcopt", "alcray", "mspe", "nn", "fish"),
Xi.ret = TRUE, pall = FALSE,
close = min((1000+end)*if(method[1] %in% c("alcray", "alcopt")) 10 else 1, nrow(X)),
parallel = c("none", "omp", "gpu"),
numstart = if(method[1] == "alcray") ncol(X) else 1,
rect = NULL, lite = TRUE, verb = 0)
laGPsep(Xref, start, end, X, Z, d = NULL, g = 1/10000,
method = c("alc", "alcopt", "alcray", "nn"), Xi.ret = TRUE,
close = min((1000+end)*if(method[1] %in% c("alcray", "alcopt")) 10 else 1, nrow(X)),
alc.gpu = FALSE, numstart = if(method[1] == "alcray") ncol(X) else 1,
rect = NULL, lite = TRUE, verb=0)
laGPsep.R(Xref, start, end, X, Z, d = NULL, g = 1/10000,
method = c("alc", "alcopt", "alcray", "nn"),
Xi.ret = TRUE, pall = FALSE,
close = min((1000+end)*if(method[1] %in% c("alcray", "alcopt")) 10 else 1, nrow(X)),
parallel = c("none", "omp", "gpu"),
numstart = if(method[1] == "alcray") ncol(X) else 1,
rect = NULL, lite = TRUE, verb = 0)
Arguments
Xref |
a vector of length |
start |
the number of Nearest Neighbor (NN) locations for initialization; must
specify |
end |
the total size of the local designs; must have |
X |
a |
Z |
a vector of responses/dependent values with |
d |
a prior or initial setting for the lengthscale
parameter for a Gaussian correlation function; a (default)
|
g |
a prior or initial setting for the nugget parameter; a
|
method |
Specifies the method by which |
Xi.ret |
A scalar logical indicating whether or not a vector of indices
into |
pall |
a scalar logical (for |
close |
a non-negative integer |
alc.gpu |
a scalar |
parallel |
a switch indicating if any parallel calculation of
the criteria is desired. Currently parallelization at this level is only
provided for option |
numstart |
a scalar integer indicating the number of rays for each
greedy search when |
rect |
an optional |
lite |
Similar to the |
verb |
a non-negative integer specifying the verbosity level; |
Details
A sub-design of X
of size end
is built-up according to
the criteria prescribed by the method
and then used to predict at
Xref
. The first start
locations are NNs in order to
initialize the first GP, via newGP
or newGPsep
,
and thereby initialize the
search. Each subsequent addition is found via the chosen criterion
(method
), and the GP fit is updated via updateGP
or updateGPsep
The runtime is cubic in end
, although
the multiplicative “constant” out front depends on the
method
chosen, the size of the design X
, and
close
. The "alcray"
method has a smaller constant
since it does not search over all candidates exhaustively.
After building the sub-design, local MLE/MAP lengthscale (and/or
nugget) parameters are estimated, depending on the d
and
g
arguments supplied. This is facilitated by calls to
mleGP
or jmleGP
.
Finally predGP
is called on the resulting local GP
model, and the parameters of the resulting Student-t distribution(s)
are returned. Unless Xi.ret = FALSE
, the indices of the
local design are also returned.
laGP.R
and laGPsep.R
are a prototype R-only version for
debugging and transparency purposes. They are slower than
laGP
and laGPsep
, which are primarily in C, and may not
provide identical output in all cases due to differing library implementations
used as subroutines; see note below for an example. laGP.R
and other
.R
functions in the package may be useful for developing new programs
that involve similar subroutines. The current version of laGP.R
allows OpenMP and/or GPU parallelization of the criteria (method
) if
the package is compiled with the appropriate flags. See README/INSTALL in
the package source for more information. For algorithmic details, see
Gramacy, Niemi, & Weiss (2014)
Value
The output is a list
with the following components.
mean |
a vector of predictive means of length |
s2 |
a vector of Student-t scale parameters of length
|
df |
a Student-t degrees of freedom scalar (applies to all
|
llik |
a scalar indicating the maximized log likelihood or log posterior probability of the data/parameter(s) under the chosen sub-design; provided up to an additive constant |
time |
a scalar giving the passage of wall-clock time elapsed for (substantive parts of) the calculation |
method |
a copy of the |
d |
a full-list version of the |
g |
a full-list version of the |
mle |
if |
Xi |
when |
close |
a copy of the input argument |
Note
laGPsep
provides the same functionality as laGP
but deploys
a separable covariance function. However criteria (method
s) EFI and
MSPE are not supported. This is considered “beta” functionality
at this time.
Note that using method="NN"
gives the same result as specifying
start=end
, however at some extra computational expense.
Handling multiple reference locations
(nrow(Xref) > 1
) is “beta” functionality. In this case
the initial start
locations are chosen by applying NN to the
average distances to all Xref
locations. Using
method="alcopt"
causes exhaustive search to be approximated by
a continuous analog via closed form derivatives.
See alcoptGP
for more details. Although the approximation
provided has a spirit similar to method="alcray"
, in that
both methods are intended to offer a thrifty alternative,
method="alcray"
is not applicable when nrow(Xref) > 1
.
Differences between the C qsort
function and R's
order
function may cause chosen designs returned from
laGP
and laGP.R
(code and laGPsep
and laGPsep.R
)
to differ when multiple X
values are equidistant to Xref
Author(s)
Robert B. Gramacy rbg@vt.edu and Furong Sun furongs@vt.edu
References
Gramacy, R. B. (2020) Surrogates: Gaussian Process Modeling, Design and Optimization for the Applied Sciences. Boca Raton, Florida: Chapman Hall/CRC. (See Chapter 9.) https://bobby.gramacy.com/surrogates/
F. Sun, R.B. Gramacy, B. Haaland, E. Lawrence, and A. Walker (2019). Emulating satellite drag from large simulation experiments, SIAM/ASA Journal on Uncertainty Quantification, 7(2), pp. 720-759; preprint on arXiv:1712.00182; https://arxiv.org/abs/1712.00182
R.B. Gramacy (2016). laGP: Large-Scale Spatial Modeling via
Local Approximate Gaussian Processes in R., Journal of Statistical
Software, 72(1), 1-46; doi:10.18637/jss.v072.i01
or see vignette("laGP")
R.B. Gramacy and B. Haaland (2016). Speeding up neighborhood search in local Gaussian process prediction. Technometrics, 58(3), pp. 294-303; preprint on arXiv:1409.0074 https://arxiv.org/abs/1409.0074
R.B. Gramacy and D.W. Apley (2015). Local Gaussian process approximation for large computer experiments. Journal of Computational and Graphical Statistics, 24(2), pp. 561-678; preprint on arXiv:1303.0383; https://arxiv.org/abs/1303.0383
R.B. Gramacy, J. Niemi, R.M. Weiss (2014). Massively parallel approximate Gaussian process regression. SIAM/ASA Journal on Uncertainty Quantification, 2(1), pp. 568-584; preprint on arXiv:1310.5182; https://arxiv.org/abs/1310.5182
See Also
vignette("laGP")
,
aGP
, newGP
, updateGP
,
predGP
, mleGP
, jmleGP
,
alcGP
, mspeGP
, alcrayGP
,
randLine
## path-based local prediction via laGP
Examples
## examining a particular laGP call from the larger analysis provided
## in the aGP documentation
## A simple 2-d test function used in Gramacy & Apley (2014);
## thanks to Lee, Gramacy, Taddy, and others who have used it before
f2d <- function(x, y=NULL)
{
if(is.null(y)) {
if(!is.matrix(x) && !is.data.frame(x)) x <- matrix(x, ncol=2)
y <- x[,2]; x <- x[,1]
}
g <- function(z)
return(exp(-(z-1)^2) + exp(-0.8*(z+1)^2) - 0.05*sin(8*(z+0.1)))
z <- -g(x)*g(y)
}
## build up a design with N=~40K locations
x <- seq(-2, 2, by=0.02)
X <- as.matrix(expand.grid(x, x))
Z <- f2d(X)
## optional first pass of nearest neighbor
Xref <- matrix(c(-1.725, 1.725), nrow=TRUE)
out <- laGP(Xref, 6, 50, X, Z, method="nn")
## second pass via ALC, ALCOPT, MSPE, and ALC-ray respectively,
## conditioned on MLE d-values found above
out2 <- laGP(Xref, 6, 50, X, Z, d=out$mle$d)
# out2.alcopt <- laGP(Xref, 6, 50, X, Z, d=out2$mle$d, method="alcopt")
out2.mspe <- laGP(Xref, 6, 50, X, Z, d=out2$mle$d, method="mspe")
out2.alcray <- laGP(Xref, 6, 50, X, Z, d=out2$mle$d, method="alcray")
## look at the different designs
plot(rbind(X[out2$Xi,], X[out2.mspe$Xi,]), type="n",
xlab="x1", ylab="x2", main="comparing local designs")
points(Xref[1], Xref[2], col=2, cex=0.5)
text(X[out2$Xi,], labels=1:50, cex=0.6)
# text(X[out2.alcopt$Xi,], labels=1:50, cex=0.6, col="forestgreen")
text(X[out2.mspe$Xi,], labels=1:50, cex=0.6, col="blue")
text(X[out2.alcray$Xi,], labels=1:50, cex=0.6, col="red")
legend("right", c("ALC", "ALCOPT", "MSPE", "ALCRAY"),
text.col=c("black", "forestgreen", "blue", "red"), bty="n")
## compare computational time
c(nn=out$time, alc=out2$time, # alcopt=out2.alcopt$time,
mspe=out2.mspe$time, alcray=out2.alcray$time)
## Not run:
## Joint path sampling: a comparison between ALC-ex, ALC-opt and NN
## defining a predictive path
wx <- seq(-0.85, 0.45, length=100)
W <- cbind(wx-0.75, wx^3+0.51)
## three comparators from Sun, et al. (2017)
## larger-than-default "close" argument to capture locations nearby path
p.alc <- laGPsep(W, 6, 100, X, Z, close=10000, lite=FALSE)
p.alcopt <- laGPsep(W, 6, 100, X, Z, method="alcopt", lite=FALSE)
## note that close=10*(1000+end) would be the default for method = "alcopt"
p.nn <- laGPsep(W, 6, 100, X, Z, method="nn", close=10000, lite=FALSE)
## time comparison
c(alc=p.alc$time, alcopt=p.alcopt$time, nn=p.nn$time)
## visualization
plot(W, type="l", xlab="x1", ylab="x2", xlim=c(-2.25,0), ylim=c(-0.75,1.25), lwd=2)
points(X[p.alc$Xi,], col=2, cex=0.6)
lines(W[,1]+0.25, W[,2]-0.25, lwd=2)
points(X[p.nn$Xi,1]+0.25, X[p.nn$Xi,2]-0.25, pch=22, col=3, cex=0.6)
lines(W[,1]-0.25, W[,2]+0.25, lwd=2)
points(X[p.alcopt$Xi,1]-0.25, X[p.alcopt$Xi,2]+0.25, pch=23, col=4, cex=0.6)
legend("bottomright", c("ALC-opt", "ALC-ex", "NN"), pch=c(22, 21, 23), col=c(4,2,3))
## End(Not run)