GauPro_kernel_model {GauPro} | R Documentation
Gaussian process model with kernel
Description
Class providing object with methods for fitting a GP model. Allows for different kernel and trend functions to be used. The object is an R6 object with many methods that can be called.
'gpkm()' is equivalent to 'GauPro_kernel_model$new()', but is easier to type and gives parameter autocomplete suggestions.
Format
R6Class object.
Value
Object of R6Class with methods for fitting a GP model.
Methods
new(X, Z, corr="Gauss", verbose=0, separable=T, useC=F, useGrad=T, parallel=T, nug.est=T, ...)
This method is used to create an object of this class with X and Z as the data.
update(Xnew=NULL, Znew=NULL, Xall=NULL, Zall=NULL, restarts = 0, param_update = T, nug.update = self$nug.est)
This method updates the model, adding new data if given, then running optimization again.
Public fields
X
Design matrix
Z
Responses
N
Number of data points
D
Dimension of data
nug.min
Minimum value of nugget
nug.max
Maximum value of the nugget.
nug.est
Should the nugget be estimated?
nug
Value of the nugget. It is estimated unless told otherwise.
param.est
Should the kernel parameters be estimated?
verbose
0 means nothing printed, 1 prints some, 2 prints most.
useGrad
Should the gradient be used?
useC
Should C code be used?
parallel
Should the code be run in parallel?
parallel_cores
Number of cores to use when running in parallel. By default, it is detected automatically.
kernel
The kernel to determine the correlations.
trend
The trend.
mu_hatX
Predicted trend value for each point in X.
s2_hat
Variance parameter estimate
K
Covariance matrix
Kchol
Cholesky factorization of K
Kinv
Inverse of K
Kinv_Z_minus_mu_hatX
K inverse times Z minus the predicted trend at X.
restarts
Number of optimization restarts to do when updating.
normalize
Should the inputs be normalized?
normalize_mean
If using normalize, the mean of each column.
normalize_sd
If using normalize, the standard deviation of each column.
optimizer
What algorithm should be used to optimize the parameters.
track_optim
Should it track the parameters evaluated while optimizing?
track_optim_inputs
If track_optim is TRUE, this will keep a list of parameters evaluated. View them with plot_track_optim.
track_optim_dev
If track_optim is TRUE, this will keep a vector of the deviance values calculated while optimizing parameters. View them with plot_track_optim.
formula
Formula
convert_formula_data
List for storing data to convert data using the formula
Methods
Public methods
Method new()
Create kernel_model object
Usage
GauPro_kernel_model$new( X, Z, kernel, trend, verbose = 0, useC = TRUE, useGrad = TRUE, parallel = FALSE, parallel_cores = "detect", nug = 1e-06, nug.min = 1e-08, nug.max = 100, nug.est = TRUE, param.est = TRUE, restarts = 0, normalize = FALSE, optimizer = "L-BFGS-B", track_optim = FALSE, formula, data, ... )
Arguments
X
Matrix whose rows are the input points
Z
Output points corresponding to X
kernel
The kernel to use. E.g., Gaussian$new().
trend
Trend to use. E.g., trend_constant$new().
verbose
Amount of output to print. 0 prints little, 2 prints a lot.
useC
Should C code be used when possible? Should be faster.
useGrad
Should the gradient be used?
parallel
Should code be run in parallel? Makes optimization faster but uses more computing resources.
parallel_cores
When using parallel, how many cores should be used?
nug
Value for the nugget. The starting value if estimating it.
nug.min
Minimum allowable value for the nugget.
nug.max
Maximum allowable value for the nugget.
nug.est
Should the nugget be estimated?
param.est
Should the kernel parameters be estimated?
restarts
How many optimization restarts should be used when estimating parameters?
normalize
Should the data be normalized?
optimizer
What algorithm should be used to optimize the parameters.
track_optim
Should it track the parameters evaluated while optimizing?
formula
Formula for the data if given in a data frame.
data
Data frame of data. Use in conjunction with formula.
...
Not used
Method fit()
Fit model
Usage
GauPro_kernel_model$fit(X, Z)
Arguments
X
Inputs
Z
Outputs
Method update_K_and_estimates()
Update covariance matrix and estimates
Usage
GauPro_kernel_model$update_K_and_estimates()
Method predict()
Predict for a matrix of points
Usage
GauPro_kernel_model$predict( XX, se.fit = F, covmat = F, split_speed = F, mean_dist = FALSE, return_df = TRUE )
Arguments
XX
points to predict at
se.fit
Should standard error be returned?
covmat
Should covariance matrix be returned?
split_speed
Should the matrix be split for faster predictions?
mean_dist
Should the error be for the distribution of the mean?
return_df
When returning se.fit, should it be returned in a data frame? Otherwise it will be a list, which is faster.
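A minimal usage sketch of predict(), assuming the GauPro package is installed and reusing the toy sine data from the Examples section. The object names (xx, pred_mean_only, pred_with_se) are illustrative only, and the exact columns of the returned data frame are not asserted here.

```r
library(GauPro)

set.seed(1)
n <- 12
x <- matrix(seq(0, 1, length.out = n), ncol = 1)
y <- sin(2 * pi * x) + rnorm(n, 0, 1e-1)
gp <- GauPro_kernel_model$new(X = x, Z = y, kernel = "gauss")

# Five new input points, one per row.
xx <- matrix(c(0.1, 0.3, 0.5, 0.7, 0.9), ncol = 1)

# Mean predictions only.
pred_mean_only <- gp$predict(xx)

# Predictions with standard errors. With the default return_df = TRUE,
# the result is documented to be a data frame.
pred_with_se <- gp$predict(xx, se.fit = TRUE)
print(pred_with_se)
```

Setting return_df = FALSE is documented to return a list instead, which may be preferable when predict() is called many times in a loop.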
Method pred()
Predict for a matrix of points
Usage
GauPro_kernel_model$pred( XX, se.fit = F, covmat = F, split_speed = F, mean_dist = FALSE, return_df = TRUE )
Arguments
XX
points to predict at
se.fit
Should standard error be returned?
covmat
Should covariance matrix be returned?
split_speed
Should the matrix be split for faster predictions?
mean_dist
Should the error be for the distribution of the mean?
return_df
When returning se.fit, should it be returned in a data frame? Otherwise it will be a list, which is faster.
Method pred_one_matrix()
Predict for a matrix of points
Usage
GauPro_kernel_model$pred_one_matrix( XX, se.fit = F, covmat = F, return_df = FALSE, mean_dist = FALSE )
Arguments
XX
points to predict at
se.fit
Should standard error be returned?
covmat
Should covariance matrix be returned?
return_df
When returning se.fit, should it be returned in a data frame? Otherwise it will be a list, which is faster.
mean_dist
Should the error be for the distribution of the mean?
Method pred_mean()
Predict mean
Usage
GauPro_kernel_model$pred_mean(XX, kx.xx)
Arguments
XX
points to predict at
kx.xx
Covariance of X with XX
Method pred_meanC()
Predict mean using C
Usage
GauPro_kernel_model$pred_meanC(XX, kx.xx)
Arguments
XX
points to predict at
kx.xx
Covariance of X with XX
Method pred_var()
Predict variance
Usage
GauPro_kernel_model$pred_var(XX, kxx, kx.xx, covmat = F)
Arguments
XX
points to predict at
kxx
Covariance of XX with itself
kx.xx
Covariance of X with XX
covmat
Should the covariance matrix be returned?
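The arguments kxx and kx.xx match the quantities in the standard GP conditioning formulas. The following base R sketch shows those textbook formulas under simplifying assumptions: a hypothetical one-dimensional squared-exponential kernel (sq_exp_kernel, with made-up theta and s2), a constant trend, and a fixed nugget. It is not GauPro's internal code and its numbers need not match the package.

```r
# Hypothetical squared-exponential covariance for 1-D inputs; not a GauPro function.
sq_exp_kernel <- function(A, B, theta = 10, s2 = 1) {
  s2 * exp(-theta * outer(A[, 1], B[, 1], "-")^2)
}

X   <- matrix(seq(0, 1, length.out = 6), ncol = 1)  # design points
Z   <- sin(2 * pi * X[, 1])                         # responses
XX  <- matrix(c(0.25, 0.5), ncol = 1)               # prediction points
nug <- 1e-6
mu  <- mean(Z)                                      # constant trend (assumption)

K     <- sq_exp_kernel(X, X) + diag(nug, nrow(X))   # covariance of the data
Kinv  <- solve(K)
kx.xx <- sq_exp_kernel(X, XX)                       # covariance of X with XX
kxx   <- sq_exp_kernel(XX, XX)                      # covariance of XX with itself

# Predictive mean: trend plus kernel-weighted residuals.
pred_mean <- mu + drop(t(kx.xx) %*% Kinv %*% (Z - mu))

# Predictive covariance; its diagonal is the pointwise variance.
pred_cov <- kxx - t(kx.xx) %*% Kinv %*% kx.xx
pred_var <- diag(pred_cov)
```

The product Kinv %*% (Z - mu) depends only on the training data, which is presumably why the class caches it in the Kinv_Z_minus_mu_hatX field.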
Method pred_LOO()
Leave-one-out predictions
Usage
GauPro_kernel_model$pred_LOO(se.fit = FALSE)
Arguments
se.fit
Should standard errors be included?
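For intuition, leave-one-out predictions with fixed kernel parameters can be computed without refitting, using a well-known closed form based on the inverse covariance matrix. The base R sketch below assumes a zero mean, a hypothetical squared-exponential kernel, and a fixed nugget, and checks the closed form against brute-force removal of each point. Whether pred_LOO() uses this identity, and how it treats the trend, is not stated here.

```r
# Hypothetical squared-exponential covariance for 1-D inputs; not a GauPro function.
sq_exp_kernel <- function(A, B, theta = 10, s2 = 1) {
  s2 * exp(-theta * outer(A[, 1], B[, 1], "-")^2)
}

X <- matrix(seq(0, 1, length.out = 6), ncol = 1)
Z <- sin(2 * pi * X[, 1])
n <- nrow(X)

K    <- sq_exp_kernel(X, X) + diag(1e-4, n)  # covariance with nugget
Kinv <- solve(K)

# Closed form: prediction of Z[i] from all other points, and its variance.
loo_mean <- Z - drop(Kinv %*% Z) / diag(Kinv)
loo_var  <- 1 / diag(Kinv)

# Brute force: drop point i, then predict it from the remaining points.
loo_brute <- sapply(seq_len(n), function(i) {
  drop(t(K[-i, i]) %*% solve(K[-i, -i]) %*% Z[-i])
})
```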
Method pred_var_after_adding_points()
Predict variance after adding points
Usage
GauPro_kernel_model$pred_var_after_adding_points(add_points, pred_points)
Arguments
add_points
Points to add
pred_points
Points to predict at
Method pred_var_after_adding_points_sep()
Predict variance reductions after adding each point separately
Usage
GauPro_kernel_model$pred_var_after_adding_points_sep(add_points, pred_points)
Arguments
add_points
Points to add
pred_points
Points to predict at
Method pred_var_reduction()
Predict variance reduction for a single point
Usage
GauPro_kernel_model$pred_var_reduction(add_point, pred_points)
Arguments
add_point
Point to add
pred_points
Points to predict at
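The variance reduction from adding one point has a simple closed form in the textbook GP setting: the squared posterior covariance between the prediction point and the added point, divided by the posterior variance of a noisy observation at the added point. The base R sketch below assumes a hypothetical squared-exponential kernel and fixed parameters, and verifies the formula by recomputing the variance with the point actually added. It is an illustration, not GauPro's implementation.

```r
# Hypothetical squared-exponential covariance for 1-D inputs; not a GauPro function.
sq_exp_kernel <- function(A, B, theta = 10, s2 = 1) {
  s2 * exp(-theta * outer(A[, 1], B[, 1], "-")^2)
}

X    <- matrix(seq(0, 1, length.out = 6), ncol = 1)
nug  <- 1e-4
Kinv <- solve(sq_exp_kernel(X, X) + diag(nug, nrow(X)))

# Posterior covariance between two sets of points given the design X.
post_cov <- function(A, B) {
  sq_exp_kernel(A, B) - t(sq_exp_kernel(X, A)) %*% Kinv %*% sq_exp_kernel(X, B)
}

add_point   <- matrix(0.5, ncol = 1)
pred_points <- matrix(c(0.45, 0.8), ncol = 1)

# Closed-form reduction in predictive variance at each prediction point.
c_pa      <- post_cov(pred_points, add_point)
v_a       <- post_cov(add_point, add_point) + nug
reduction <- drop(c_pa^2) / drop(v_a)

# Check: recompute the variance after adding the point to the design.
X2    <- rbind(X, add_point)
K2inv <- solve(sq_exp_kernel(X2, X2) + diag(nug, nrow(X2)))
k2    <- sq_exp_kernel(X2, pred_points)
var_before <- diag(post_cov(pred_points, pred_points))
var_after  <- diag(sq_exp_kernel(pred_points, pred_points) - t(k2) %*% K2inv %*% k2)
```

Note that the reduction does not depend on the response that would be observed at the added point, so it can be evaluated before running the experiment.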
Method pred_var_reductions()
Predict variance reductions
Usage
GauPro_kernel_model$pred_var_reductions(add_points, pred_points)
Arguments
add_points
Points to add
pred_points
Points to predict at
Method plot()
Plot the object
Usage
GauPro_kernel_model$plot(...)
Arguments
...
Parameters passed to cool1Dplot(), plot2D(), or plotmarginal()
Method cool1Dplot()
Make cool 1D plot
Usage
GauPro_kernel_model$cool1Dplot( n2 = 20, nn = 201, col2 = "green", xlab = "x", ylab = "y", xmin = NULL, xmax = NULL, ymin = NULL, ymax = NULL, gg = TRUE )
Arguments
n2
Number of things to plot
nn
Number of things to plot
col2
color
xlab
x label
ylab
y label
xmin
xmin
xmax
xmax
ymin
ymin
ymax
ymax
gg
Should ggplot2 be used to make plot?
Method plot1D()
Make 1D plot
Usage
GauPro_kernel_model$plot1D( n2 = 20, nn = 201, col2 = 2, col3 = 3, xlab = "x", ylab = "y", xmin = NULL, xmax = NULL, ymin = NULL, ymax = NULL, gg = TRUE )
Arguments
n2
Number of things to plot
nn
Number of things to plot
col2
Color of the prediction interval
col3
Color of the interval for the mean
xlab
x label
ylab
y label
xmin
xmin
xmax
xmax
ymin
ymin
ymax
ymax
gg
Should ggplot2 be used to make plot?
Method plot2D()
Make 2D plot
Usage
GauPro_kernel_model$plot2D(se = FALSE, mean = TRUE, horizontal = TRUE, n = 50)
Arguments
se
Should the standard error of prediction be plotted?
mean
Should the mean be plotted?
horizontal
If plotting mean and se, should they be next to each other?
n
Number of points along each dimension
Method plotmarginal()
Plot marginal. For each input, hold all others at a constant value and adjust it along its range to see how the prediction changes.
Usage
GauPro_kernel_model$plotmarginal(npt = 5, ncol = NULL)
Arguments
npt
Number of lines to make. Each line represents changing a single variable while holding the others at the same values.
ncol
Number of columns for the plot
Method plotmarginalrandom()
Plot marginal prediction for random sample of inputs
Usage
GauPro_kernel_model$plotmarginalrandom(npt = 100, ncol = NULL)
Arguments
npt
Number of random points to evaluate
ncol
Number of columns in the plot
Method plotkernel()
Plot the kernel
Usage
GauPro_kernel_model$plotkernel(X = self$X)
Arguments
X
X matrix for kernel plot
Method plotLOO()
Plot leave one out predictions for design points
Usage
GauPro_kernel_model$plotLOO()
Method plot_track_optim()
If track_optim is TRUE, this will plot the parameters in the order they were evaluated.
Usage
GauPro_kernel_model$plot_track_optim(minindex = NULL)
Arguments
minindex
Minimum index to plot.
Method loglikelihood()
Calculate loglikelihood of parameters
Usage
GauPro_kernel_model$loglikelihood(mu = self$mu_hatX, s2 = self$s2_hat)
Arguments
mu
Mean parameters
s2
Variance parameter
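As background, the log-likelihood of a GP is the multivariate normal log-density of the responses. The base R sketch below evaluates it through a Cholesky factor, which is the usual numerically stable route. The function name gp_loglik and the toy covariance matrix are assumptions for illustration; GauPro's own method takes mu and s2 and may drop or scale constants differently.

```r
# Multivariate normal log-density of Z with mean mu and covariance K.
# Hypothetical helper; not part of GauPro.
gp_loglik <- function(Z, mu, K) {
  n <- length(Z)
  R <- chol(K)                                  # upper triangular, K = t(R) %*% R
  a <- backsolve(R, Z - mu, transpose = TRUE)   # solves t(R) a = Z - mu
  # sum(a^2) is the quadratic form; sum(log(diag(R))) is half of log(det(K)).
  -0.5 * sum(a^2) - sum(log(diag(R))) - 0.5 * n * log(2 * pi)
}

x <- seq(0, 1, length.out = 5)
K <- exp(-10 * outer(x, x, "-")^2) + diag(1e-4, 5)  # toy covariance with nugget
Z <- sin(2 * pi * x)
ll <- gp_loglik(Z, mu = 0, K = K)
```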
Method AIC()
AIC (Akaike information criterion)
Usage
GauPro_kernel_model$AIC()
Method get_optim_functions()
Get optimization functions
Usage
GauPro_kernel_model$get_optim_functions(param_update, nug.update)
Arguments
param_update
Should parameters be updated?
nug.update
Should nugget be updated?
Method param_optim_lower()
Lower bounds of parameters for optimization
Usage
GauPro_kernel_model$param_optim_lower(nug.update)
Arguments
nug.update
Is the nugget being updated?
Method param_optim_upper()
Upper bounds of parameters for optimization
Usage
GauPro_kernel_model$param_optim_upper(nug.update)
Arguments
nug.update
Is the nugget being updated?
Method param_optim_start()
Starting point for parameters for optimization
Usage
GauPro_kernel_model$param_optim_start(nug.update, jitter)
Arguments
nug.update
Is nugget being updated?
jitter
Should there be a jitter?
Method param_optim_start0()
Starting point for parameters for optimization
Usage
GauPro_kernel_model$param_optim_start0(nug.update, jitter)
Arguments
nug.update
Is nugget being updated?
jitter
Should there be a jitter?
Method param_optim_start_mat()
Get matrix for starting points of optimization
Usage
GauPro_kernel_model$param_optim_start_mat(restarts, nug.update, l)
Arguments
restarts
Number of restarts to use
nug.update
Is nugget being updated?
l
Not used
Method optim()
Optimize parameters
Usage
GauPro_kernel_model$optim( restarts = self$restarts, n0 = 5 * self$D, param_update = T, nug.update = self$nug.est, parallel = self$parallel, parallel_cores = self$parallel_cores )
Arguments
restarts
Number of restarts to do
n0
This many starting parameters are chosen and evaluated. The best ones are used as the starting points for optimization.
param_update
Should parameters be updated?
nug.update
Should nugget be updated?
parallel
Should restarts be done in parallel?
parallel_cores
If running parallel, how many cores should be used?
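The n0 and restarts arguments describe a multi-start strategy: evaluate many candidate starting points cheaply, then run the optimizer only from the best few. The base R sketch below illustrates that general idea with stats::optim and a toy objective. The function multistart_optim, the uniform sampling of candidates, and the choice of restarts + 1 runs are assumptions for illustration and may differ from what optim() does inside GauPro.

```r
# Hypothetical multi-start wrapper around stats::optim; not GauPro code.
multistart_optim <- function(fn, lower, upper, n0 = 10, restarts = 2) {
  d <- length(lower)
  # Draw n0 candidate starting points uniformly within the bounds (one per row).
  starts <- matrix(runif(n0 * d, lower, upper), ncol = d, byrow = TRUE)
  start_vals <- apply(starts, 1, fn)
  # Keep the candidates with the lowest objective values.
  best_idx <- order(start_vals)[seq_len(min(restarts + 1, n0))]
  runs <- lapply(best_idx, function(i) {
    optim(starts[i, ], fn, method = "L-BFGS-B", lower = lower, upper = upper)
  })
  # Return the run that reached the lowest objective value.
  runs[[which.min(vapply(runs, function(r) r$value, numeric(1)))]]
}

set.seed(3)
toy <- function(p) sum((p - c(0.3, -0.2))^2)  # minimum at (0.3, -0.2)
res <- multistart_optim(toy, lower = c(-1, -1), upper = c(1, 1))
```

Screening starting points first is cheap relative to a full optimization run, and it lowers the chance of all restarts ending in the same poor local optimum.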
Method optimRestart()
Run a single optimization restart.
Usage
GauPro_kernel_model$optimRestart( start.par, start.par0, param_update, nug.update, optim.func, optim.grad, optim.fngr, lower, upper, jit = T, start.par.i )
Arguments
start.par
Starting parameters
start.par0
Starting parameters
param_update
Should parameters be updated?
nug.update
Should nugget be updated?
optim.func
Function to optimize.
optim.grad
Gradient of function to optimize.
optim.fngr
Function that returns the function value and its gradient.
lower
Lower bounds for optimization
upper
Upper bounds for optimization
jit
Is jitter being used?
start.par.i
Starting parameters for this restart
Method update()
Update the model. Give either (Xnew and Znew) or (Xall and Zall), not both.
Usage
GauPro_kernel_model$update( Xnew = NULL, Znew = NULL, Xall = NULL, Zall = NULL, restarts = self$restarts, param_update = self$param.est, nug.update = self$nug.est, no_update = FALSE )
Arguments
Xnew
New X values to add.
Znew
New Z values to add.
Xall
All X values to be used. Will replace existing X.
Zall
All Z values to be used. Will replace existing Z.
restarts
Number of optimization restarts.
param_update
Are the parameters being updated?
nug.update
Is the nugget being updated?
no_update
Are no parameters being updated?
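A minimal usage sketch of update(), assuming the GauPro package is installed. It fits a model to a small toy data set, then appends two new observations with Xnew and Znew. The variable names are illustrative only.

```r
library(GauPro)

set.seed(2)
x <- matrix(seq(0, 1, length.out = 8), ncol = 1)
y <- sin(2 * pi * x) + rnorm(8, 0, 1e-1)
gp <- GauPro_kernel_model$new(X = x, Z = y, kernel = "gauss")
n_before <- gp$N  # number of data points before the update

# Two new observations to append to the existing data.
x_new <- matrix(c(0.33, 0.66), ncol = 1)
y_new <- sin(2 * pi * x_new)

# Adds the new data, then re-runs parameter optimization.
gp$update(Xnew = x_new, Znew = y_new)
n_after <- gp$N
```

To replace the data entirely rather than append to it, pass Xall and Zall instead.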
Method update_fast()
Fast update when adding new data.
Usage
GauPro_kernel_model$update_fast(Xnew = NULL, Znew = NULL)
Arguments
Xnew
New X values to add.
Znew
New Z values to add.
Method update_params()
Update the parameters.
Usage
GauPro_kernel_model$update_params(..., nug.update)
Arguments
...
Passed to optim.
nug.update
Is the nugget being updated?
Method update_data()
Update the data. Give either (Xnew and Znew) or (Xall and Zall), not both.
Usage
GauPro_kernel_model$update_data( Xnew = NULL, Znew = NULL, Xall = NULL, Zall = NULL )
Arguments
Xnew
New X values to add.
Znew
New Z values to add.
Xall
All X values to be used. Will replace existing X.
Zall
All Z values to be used. Will replace existing Z.
Method update_corrparams()
Update correlation parameters. Not the nugget.
Usage
GauPro_kernel_model$update_corrparams(...)
Arguments
...
Passed to self$update()
Method update_nugget()
Update the nugget. Not the correlation parameters.
Usage
GauPro_kernel_model$update_nugget(...)
Arguments
...
Passed to self$update()
Method deviance()
Calculate the deviance.
Usage
GauPro_kernel_model$deviance( params = NULL, nug = self$nug, nuglog, trend_params = NULL )
Arguments
params
Kernel parameters
nug
Nugget
nuglog
Log of the nugget. Give only one of nug or nuglog.
trend_params
Parameters for the trend.
Method deviance_grad()
Calculate the gradient of the deviance.
Usage
GauPro_kernel_model$deviance_grad( params = NULL, kernel_update = TRUE, X = self$X, nug = self$nug, nug.update, nuglog, trend_params = NULL, trend_update = TRUE )
Arguments
params
Kernel parameters
kernel_update
Is the kernel being updated? If yes, it's part of the gradient.
X
Input matrix
nug
Nugget
nug.update
Is the nugget being updated? If yes, it's part of the gradient.
nuglog
Log of the nugget.
trend_params
Trend parameters
trend_update
Is the trend being updated? If yes, it's part of the gradient.
Method deviance_fngr()
Calculate the deviance along with its gradient.
Usage
GauPro_kernel_model$deviance_fngr( params = NULL, kernel_update = TRUE, X = self$X, nug = self$nug, nug.update, nuglog, trend_params = NULL, trend_update = TRUE )
Arguments
params
Kernel parameters
kernel_update
Is the kernel being updated? If yes, it's part of the gradient.
X
Input matrix
nug
Nugget
nug.update
Is the nugget being updated? If yes, it's part of the gradient.
nuglog
Log of the nugget.
trend_params
Trend parameters
trend_update
Is the trend being updated? If yes, it's part of the gradient.
Method grad()
Calculate gradient
Usage
GauPro_kernel_model$grad(XX, X = self$X, Z = self$Z)
Arguments
XX
points to calculate at
X
X points
Z
output points
Method grad_norm()
Calculate norm of gradient
Usage
GauPro_kernel_model$grad_norm(XX)
Arguments
XX
points to calculate at
Method grad_dist()
Calculate distribution of gradient
Usage
GauPro_kernel_model$grad_dist(XX)
Arguments
XX
points to calculate at
Method grad_sample()
Sample gradient at points
Usage
GauPro_kernel_model$grad_sample(XX, n)
Arguments
XX
points to calculate at
n
Number of samples
Method grad_norm2_mean()
Calculate mean of gradient norm squared
Usage
GauPro_kernel_model$grad_norm2_mean(XX)
Arguments
XX
points to calculate at
Method grad_norm2_dist()
Calculate distribution of gradient norm squared
Usage
GauPro_kernel_model$grad_norm2_dist(XX)
Arguments
XX
points to calculate at
Method grad_norm2_sample()
Get samples of squared norm of gradient
Usage
GauPro_kernel_model$grad_norm2_sample(XX, n)
Arguments
XX
points to sample at
n
Number of samples
Method hessian()
Calculate Hessian
Usage
GauPro_kernel_model$hessian(XX, as_array = FALSE)
Arguments
XX
Points to calculate Hessian at
as_array
Should result be an array?
Method gradpredvar()
Calculate gradient of the predictive variance
Usage
GauPro_kernel_model$gradpredvar(XX)
Arguments
XX
points to calculate at
Method sample()
Sample at rows of XX
Usage
GauPro_kernel_model$sample(XX, n = 1)
Arguments
XX
Input matrix
n
Number of samples
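Sampling from a GP at a set of points amounts to drawing from a multivariate normal distribution with the predictive mean and covariance at those points. The base R sketch below shows the standard Cholesky approach with a made-up mean and covariance. The helper sample_mvn and its jitter argument are hypothetical, and the orientation of its output (one draw per row) is a choice made here, not a statement about what sample() returns.

```r
# Draw n samples from N(mean, cov) using a Cholesky factor.
# Hypothetical helper; not part of GauPro.
sample_mvn <- function(n, mean, cov, jitter = 1e-8) {
  m <- length(mean)
  R <- chol(cov + diag(jitter, m))      # cov is approximately t(R) %*% R
  z <- matrix(rnorm(n * m), nrow = n)   # independent standard normals
  sweep(z %*% R, 2, mean, "+")          # each row is one draw
}

set.seed(4)
pred_mean <- c(0, 1)
pred_cov  <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
draws <- sample_mvn(2000, pred_mean, pred_cov)
```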
Method optimize_fn()
Optimize any function of the GP prediction over the valid input space. If there are inputs that should only be optimized over a discrete set of values, specify 'mopar' for all parameters. Factor inputs will be handled automatically.
Usage
GauPro_kernel_model$optimize_fn( fn = NULL, lower = apply(self$X, 2, min), upper = apply(self$X, 2, max), n0 = 100, minimize = FALSE, fn_args = NULL, gr = NULL, fngr = NULL, mopar = NULL, groupeval = FALSE )
Arguments
fn
Function to optimize
lower
Lower bounds to search within
upper
Upper bounds to search within
n0
Number of points to evaluate in initial stage
minimize
Are you trying to minimize the output?
fn_args
Arguments to pass to the function fn.
gr
Gradient of function to optimize.
fngr
Function that returns a list with named elements "fn" for the function value and "gr" for the gradient. Useful when it is slow to evaluate and fn/gr would duplicate calculations if done separately.
mopar
List of parameters using mixopt
groupeval
Can a matrix of points be evaluated? Otherwise just a single point at a time.
Method EI()
Calculate expected improvement
Usage
GauPro_kernel_model$EI(x, minimize = FALSE, eps = 0, return_grad = FALSE, ...)
Arguments
x
Vector to calculate EI of, or matrix for whose rows it should be calculated
minimize
Are you trying to minimize the output?
eps
Exploration parameter
return_grad
Should the gradient be returned?
...
Additional args
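For reference, the textbook expected improvement at a point depends only on the predictive mean, the predictive standard error, and the best value observed so far. The base R sketch below implements that formula. The function expected_improvement and the way eps is subtracted from the improvement are assumptions for illustration. This is the plain EI criterion, not the augmented or corrected variants, and EI() may differ in detail.

```r
# Textbook expected improvement; hypothetical helper, not part of GauPro.
# pred_mean and pred_se must be vectors of the same length.
expected_improvement <- function(pred_mean, pred_se, best,
                                 minimize = FALSE, eps = 0) {
  # Improvement over the current best, in the direction being optimized.
  delta <- if (minimize) best - pred_mean - eps else pred_mean - best - eps
  z  <- delta / pred_se
  ei <- delta * pnorm(z) + pred_se * dnorm(z)
  # With no predictive uncertainty the improvement is deterministic.
  no_unc <- pred_se <= 0
  ei[no_unc] <- pmax(delta[no_unc], 0)
  ei
}

ei_vals <- expected_improvement(
  pred_mean = c(0.2, 1.0, 1.5),
  pred_se   = c(0.5, 0.1, 0),
  best      = 1.0
)
```

The first term rewards points whose predicted mean beats the current best, and the second rewards points with high uncertainty, which is how EI balances exploitation and exploration.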
Method maxEI()
Find the point that maximizes the expected improvement. If there are inputs that should only be optimized over a discrete set of values, specify 'mopar' for all parameters.
Usage
GauPro_kernel_model$maxEI( lower = apply(self$X, 2, min), upper = apply(self$X, 2, max), n0 = 100, minimize = FALSE, eps = 0, dontconvertback = FALSE, EItype = "corrected", mopar = NULL, usegrad = FALSE )
Arguments
lower
Lower bounds to search within
upper
Upper bounds to search within
n0
Number of points to evaluate in initial stage
minimize
Are you trying to minimize the output?
eps
Exploration parameter
dontconvertback
If data was given with a formula, should it be converted back to the original scale?
EItype
Type of EI to calculate. One of "EI", "Augmented", or "Corrected"
mopar
List of parameters using mixopt
usegrad
Should the gradient be used when optimizing? Can make it faster.
Method maxqEI()
Find the multiple points that maximize the expected improvement. Currently only implements the constant liar method.
Usage
GauPro_kernel_model$maxqEI( npoints, method = "pred", lower = apply(self$X, 2, min), upper = apply(self$X, 2, max), n0 = 100, minimize = FALSE, eps = 0, EItype = "corrected", dontconvertback = FALSE, mopar = NULL )
Arguments
npoints
Number of points to add
method
Method to use for setting the output value for the points chosen as a placeholder. Can be one of: "CL" for constant liar, which uses the best value seen yet; or "pred", which uses the predicted value, also called the Believer method in literature.
lower
Lower bounds to search within
upper
Upper bounds to search within
n0
Number of points to evaluate in initial stage
minimize
Are you trying to minimize the output?
eps
Exploration parameter
EItype
Type of EI to calculate. One of "EI", "Augmented", or "Corrected"
dontconvertback
If data was given with a formula, should it be converted back to the original scale?
mopar
List of parameters using mixopt
Method KG()
Calculate Knowledge Gradient
Usage
GauPro_kernel_model$KG(x, minimize = FALSE, eps = 0, current_extreme = NULL)
Arguments
x
Point to calculate at
minimize
Is the objective to minimize?
eps
Exploration parameter
current_extreme
Used for recursive solving
Method AugmentedEI()
Calculate augmented EI
Usage
GauPro_kernel_model$AugmentedEI( x, minimize = FALSE, eps = 0, return_grad = F, ... )
Arguments
x
Vector to calculate EI of, or matrix for whose rows it should be calculated
minimize
Are you trying to minimize the output?
eps
Exploration parameter
return_grad
Should the gradient be returned?
...
Additional args
f
The reference max, user shouldn't change this.
Method CorrectedEI()
Calculate corrected EI
Usage
GauPro_kernel_model$CorrectedEI( x, minimize = FALSE, eps = 0, return_grad = F, ... )
Arguments
x
Vector to calculate EI of, or matrix for whose rows it should be calculated
minimize
Are you trying to minimize the output?
eps
Exploration parameter
return_grad
Should the gradient be returned?
...
Additional args
Method importance()
Feature importance
Usage
GauPro_kernel_model$importance(plot = TRUE, print_bars = TRUE)
Arguments
plot
Should the plot be made?
print_bars
Should the importances be printed as bars?
Method print()
Print this object
Usage
GauPro_kernel_model$print()
Method summary()
Summary
Usage
GauPro_kernel_model$summary(...)
Arguments
...
Additional arguments
Method clone()
The objects of this class are cloneable with this method.
Usage
GauPro_kernel_model$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
References
https://scikit-learn.org/stable/modules/permutation_importance.html#id2
Examples
# 1D example: fit a GP to a noisy sine curve
n <- 12
x <- matrix(seq(0,1,length.out = n), ncol=1)
y <- sin(2*pi*x) + rnorm(n,0,1e-1)
gp <- GauPro_kernel_model$new(X=x, Z=y, kernel="gauss")
# Predict at a single new point, then plot the fitted model
gp$predict(.454)
gp$plot1D()
gp$cool1Dplot()

# 7D example: only the first four inputs affect the output
n <- 200
d <- 7
x <- matrix(runif(n*d), ncol=d)
f <- function(x) {x[1]*x[2] + cos(x[3]) + x[4]^2}
y <- apply(x, 1, f)
gp <- GauPro_kernel_model$new(X=x, Z=y, kernel=Gaussian)