Fit {GPM}    R Documentation
The Fitting Function of GPM Package
Description
Fits a Gaussian process (GP) to a set of simulation data as described in reference 1. Both the inputs and outputs can be multi-dimensional. The outputs can be noisy, in which case the noise is assumed to be stationary (i.e., its variance does not depend on x).
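For illustration, the short sketch below (an editorial example, not taken from the package documentation) generates outputs whose noise variance is the same at every x, which is exactly the stationary-noise setting described above; only the Fit interface shown under Usage below is assumed.
# Hedged sketch: outputs with stationary noise, i.e., the same noise
# variance at every x. Only the Fit() interface shown under Usage is assumed.
library(GPM)
library(lhs)
X <- 5*maximinLHS(40, 1)                   # 1D design on [0, 5]
Y <- 2*sin(2*X) + rnorm(nrow(X), 0, 0.3)   # noise sd = 0.3 everywhere (stationary)
M <- Fit(X, Y)                             # the estimated nugget absorbs the noise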
Usage
Fit(X, Y, CorrType = 'G', Eps = 10^(seq(-1, -12)), AnaGr = NULL, Nopt = 5,
    TraceIt = 0, MaxIter = 100, Seed = 1, LowerBound = NULL, UpperBound = NULL,
    StopFlag = 1, Progress = 0, DoParallel = 0, Ncores = NULL)
Arguments
X
Matrix containing the training (aka design or input) data points. The rows and columns of X correspond to the observations and the input dimensions, respectively.
Y
Matrix containing the output (aka response) data points. The rows and columns of Y correspond to the observations and the output dimensions, respectively.
CorrType
The type of the correlation function of the GP model. Choices include 'G' (Gaussian), 'PE' (power exponential), and 'LBG' (lifted Brownian with Gamma = 1); see the Examples section.
Eps
A vector containing the smallest eigenvalue(s) that the correlation matrix is allowed to have. The elements of Eps must be in [0, 1] and sorted in descending order.
AnaGr
Flag indicating whether the gradient of the log-likelihood is computed analytically or approximated numerically during optimization.
Nopt
The number of times the log-likelihood function is optimized, each time from a different starting point.
TraceIt
Non-negative integer. If positive, tracing information on the progress of the optimization is printed. There are six levels of tracing (see optim for details).
MaxIter
Maximum number of iterations allowed for each optimization (see optim).
Seed
An integer seed for the random number generator. Use it to make the results reproducible.
LowerBound, UpperBound
To estimate the scale (aka roughness) parameters of the correlation function, a feasible range must be defined; LowerBound and UpperBound specify its lower and upper limits.
StopFlag
Flag indicating whether the optimization must be stopped if the negative log-likelihood increases as Eps decreases.
Progress
Flag indicating whether the fitting process should be summarized. Set it to a nonzero value to print progress information; the default of 0 is silent.
DoParallel
If set to a nonzero value, the optimizations are run in parallel (see the sketch after this argument list).
Ncores
Number of cores to use when DoParallel is enabled.
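The following sketch (an editorial illustration, not part of the original page) shows how several of the arguments above can be changed from their defaults; only the argument names and defaults listed under Usage are assumed.
# Hedged sketch of non-default settings; argument names and defaults are
# taken from the Usage section above, nothing else is assumed.
library(GPM)
library(lhs)
X <- 2*maximinLHS(30, 2) - 1
Y <- X[, 1]^2 + X[, 2]^2
M <- Fit(X, Y,
         CorrType = "PE",   # same correlation type used in the 2D example below
         Nopt = 10,         # more optimization restarts
         Seed = 123,        # reproducible results
         TraceIt = 1,       # print optimization progress
         MaxIter = 200,     # allow longer optimizations
         DoParallel = 1,    # run the optimizations in parallel
         Ncores = 2)        # on two cores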
Value
Model
A list containing the following components:
CovFunc
A list containing the type and estimated parameters of the correlation function.
Data
A list storing the original (but scaled) data.
Details
A list of some parameters (used in prediction) as well as some values reporting the total run-time (cost) and the added nugget (Nug_opt) for satisfying the constraint on the smallest eigenvalue of the correlation matrix.
OptimHist
The optimization history.
Setting
The default/provided settings for running the code.
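As a quick orientation, the sketch below (an editorial addition) shows how the returned list can be inspected; it assumes only the component names listed above and the fields already used in the Examples section.
# Hedged sketch of inspecting the returned list; only the component names
# listed above and the fields used in the Examples below are assumed.
M <- Fit(X, Y)            # X, Y as in any of the Examples below
str(M, max.level = 1)     # CovFunc, Data, Details, OptimHist, Setting
M$CovFunc                 # correlation type and estimated parameters
M$Details$Nug_opt         # nugget added to meet the Eps constraint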
References
Bostanabad, R., Kearney, T., Tao, S., Apley, D. W. & Chen, W. (2018) Leveraging the nugget parameter for efficient Gaussian process modeling. Int J Numer Meth Eng, 114, 501-516.
Plumlee, M. & Apley, D. W. (2017) Lifted Brownian kriging models. Technometrics, 59, 165-177.
See Also
optim for details on the L-BFGS-B algorithm used in the optimization.
Predict to use the fitted GP model for prediction.
Draw to plot the response via the fitted model.
Examples
# 1D example: Fit a model (with default settings) and evaluate the performance
# by computing the root mean squared error (RMSE) in prediction.
library(GPM)
library(lhs)
X <- 5*maximinLHS(15, 1)
Y <- 2*sin(2*X) + log(X+1)
M <- Fit(X, Y)
XF <- matrix(seq(0, 5, length.out = 100), 100, 1)
YF <- Predict(XF, M)
RMSE <- sqrt(mean((YF$YF - (2*sin(2*XF) + log(XF+1)))^2))
## Not run:
# 1D example: Fit a model, evaluate the performance, and plot the response
# along with 95% prediction interval
X <- 10*maximinLHS(10, 1) - 5
Y <- X*cos(X)
M <- Fit(X, Y)
XF <- matrix(seq(-5, 5, length.out = 500), 500, 1)
YF <- Predict(XF, M)
RMSE <- sqrt(mean((YF$YF - (XF*cos(XF)))^2))
Draw(M, 1, res = 20)
# 2D example: Fit a model, evaluate the performance, and plot the response
# surface along with 95% prediction interval
X <- 2*maximinLHS(10, 2) - 1
Y <- X[, 1]^2 + X[, 2]^2
M <- Fit(X, Y, CorrType = "PE")
XF <- 2*maximinLHS(100, 2) - 1
YF <- Predict(XF, M)
RMSE <- sqrt(mean((YF$YF - (XF[, 1]^2 + XF[, 2]^2))^2))
library(lattice)
Draw(M, c(1, 1), res = 15, PI95=1)
# 2D example: Plot the previous model wrt X1 in the [-2, 2]
# interval with X2=1
Draw(M, c(1, 0), LB = -2, UB = 2, res = 15, PI95=1)
# 3D example: Compare the performance of Gaussian ("G") and lifted Brownian
# with Gamma=1 ("LBG")
X <- 2*maximinLHS(50, 3) - 1
Y <- cos(X[, 1]^2) + 2*sin(X[, 2]^2) + X[, 3]^2
M_G <- Fit(X, Y)
M_LBG <- Fit(X, Y, CorrType = "LBG")
XF <- 2*maximinLHS(500, 3) - 1
YF_G <- Predict(XF, M_G)
YF_LBG <- Predict(XF, M_LBG)
RMSE_G <- sqrt(mean((YF_G$YF - (cos(XF[, 1]^2) + 2*sin(XF[, 2]^2) + XF[, 3]^2))^2))
RMSE_LBG <- sqrt(mean((YF_LBG$YF - (cos(XF[, 1]^2) + 2*sin(XF[, 2]^2) + XF[, 3]^2))^2))
# 3D example: Draw the response in 2D using the M_G model when X3=0
Draw(M_G, c(1, 1, 0), PI95 = 0, Values = 0, X1Label = 'Input 1', X2Label = 'Input 2')
# 3D example with a two-dimensional (two-column) response
X <- 2*maximinLHS(50, 3) - 1
Y <- cbind(cos(X[, 1]^2) + 2*sin(X[, 2]^2) + X[, 3]^2, rowSums(X))
M <- Fit(X, Y)
Draw(M, c(0, 1, 1), Response_ID = 2, Values = 0.5)
# 2D example with noise
X <- 2*maximinLHS(100, 2) - 1
Y <- X[, 1]^2 + X[, 2]^2 + matrix(rnorm(nrow(X), 0, .5), nrow(X), 1)
M <- Fit(X, Y)
# Estimating the noise variance (should be close to 0.5^2)
M$Details$Nug_opt*M$CovFunc$Parameters$Sigma2*M$Data$Yrange^2
## End(Not run)