gp {dgpsi}  R Documentation 
Gaussian process emulator construction
Description
This function builds and trains a GP emulator.
Usage
gp(
X,
Y,
struc = NULL,
name = "sexp",
lengthscale = rep(0.2, ncol(X)),
nugget_est = FALSE,
nugget = 1e-06,
training = TRUE,
verb = TRUE,
internal_input_idx = NULL,
linked_idx = NULL
)
Arguments
X 
a matrix where each row is an input data point and each column is an input dimension.

Y 
a matrix with only one column and each row being an output data point.

struc 
an object produced by kernel() that gives a user-defined GP specification. When struc = NULL,
the GP specification is automatically generated using information provided in name, lengthscale,
nugget_est, nugget, and internal_input_idx. Defaults to NULL.

name 
kernel function to be used. Either "sexp" for squared exponential kernel or
"matern2.5" for Matérn-2.5 kernel. Defaults to "sexp". This argument is only used when struc = NULL.

lengthscale 
initial values of lengthscales in the kernel function. It can be a single numeric value or a vector:
if it is a single numeric value, the kernel functions across input dimensions are assumed to share the same lengthscale;
if it is a vector (which must have a length of ncol(X)), the kernel functions across input dimensions are assumed to have different lengthscales.
Defaults to a vector of 0.2. This argument is only used when struc = NULL.
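
The two accepted forms of lengthscale can be sketched in base R as follows. The toy data and the values 0.1 and 0.5 are illustrative assumptions, not recommendations. The gp() calls are left as comments because they need the package and its Python environment (see init_py() in the Examples).

```r
# Toy input with two dimensions (illustrative data only)
X <- cbind(seq(0, 1, length.out = 10), seq(1, 2, length.out = 10))
Y <- as.matrix(sin(X[, 1]) + X[, 2])

# The documented default: a vector of 0.2, one entry per input dimension
ls_default <- rep(0.2, ncol(X))

# A single value: all input dimensions share one lengthscale
ls_shared <- 0.2

# A vector of length ncol(X): one initial lengthscale per dimension
ls_separate <- c(0.1, 0.5)
stopifnot(length(ls_separate) == ncol(X))

# With dgpsi loaded and init_py() run, these would be passed as:
# m_shared   <- gp(X, Y, lengthscale = ls_shared)
# m_separate <- gp(X, Y, lengthscale = ls_separate)
```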

nugget_est 
a bool indicating if the nugget term is to be estimated:

FALSE: the nugget term is fixed to nugget.

TRUE: the nugget term will be estimated.
Defaults to FALSE. This argument is only used when struc = NULL.

nugget 
the initial nugget value. If nugget_est = FALSE, the assigned value is fixed during the training.
Set nugget to a small value (e.g., 1e-6) and the corresponding bool in nugget_est to FALSE for deterministic emulations where the emulator
interpolates the training data points. Set nugget to a reasonably larger value and the corresponding bool in nugget_est to TRUE for stochastic
emulations where the computer model outputs are assumed to follow a homogeneous Gaussian distribution. Defaults to 1e-6. This argument is only used
when struc = NULL.
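
The two settings described above can be written down as argument lists, as a rough sketch. The stochastic starting value 0.01 is an illustrative assumption, not a value taken from the package. The gp() calls are left as comments because they need the package and its Python environment.

```r
# Deterministic emulation: small fixed nugget, so the emulator interpolates
deterministic <- list(nugget_est = FALSE, nugget = 1e-6)

# Stochastic emulation: larger starting nugget that is estimated in training.
# The value 0.01 is an illustrative assumption only.
stochastic <- list(nugget_est = TRUE, nugget = 0.01)

# With dgpsi loaded and init_py() run, and X and Y as training matrices:
# m_det  <- do.call(gp, c(list(X, Y), deterministic))
# m_stoc <- do.call(gp, c(list(X, Y), stochastic))
```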

training 
a bool indicating if the initialized GP emulator will be trained.
When set to FALSE, gp() returns an untrained GP emulator, to which one can apply summary() to inspect its specifications
(especially when a customized struc is provided) or apply predict() to check its emulation performance before the training. Defaults to TRUE.

verb 
a bool indicating if the trace information on GP emulator construction and training will be printed during the function execution.
Defaults to TRUE.

internal_input_idx 
the column indices of X that are generated by the linked emulators in the preceding layers.
Set internal_input_idx = NULL if the GP emulator is in the first layer of a system or all columns in X are
generated by the linked emulators in the preceding layers. Defaults to NULL. This argument is only used when struc = NULL.

linked_idx 
either a vector or a list of vectors:
If linked_idx is a vector, it gives indices of columns in the pooled output matrix (formed by column-combined outputs of all
emulators in the feeding layer) that feed into the GP emulator. If the GP emulator is in the first layer of a linked emulator system,
the vector gives the column indices of the global input (formed by column-combining all input matrices of emulators in the first layer)
that the GP emulator will use. The length of the vector must equal the length of internal_input_idx when internal_input_idx is not NULL.
When the GP emulator is not in the first layer of a linked emulator system, linked_idx can be a list that gives the information on connections
between the GP emulator and emulators in all preceding layers. The length of the list should equal the number of layers before
the GP emulator. Each element of the list is a vector that gives indices of columns in the pooled output matrix (formed by column-combined outputs
of all emulators) in the corresponding layer that feed into the GP emulator. If the GP emulator has no connections to any emulator in a certain layer,
set NULL in the corresponding position of the list. The order of input dimensions in X[,internal_input_idx] should be consistent with linked_idx.
For example, a GP emulator in the second layer that is fed by the output dimensions 1 and 3 of emulators in layer 1 should have linked_idx = list(c(1,3)).
In addition, the first and second columns of X[,internal_input_idx] should correspond to the output dimensions 1 and 3 from layer 1.
Set linked_idx = NULL if the GP emulator will not be used for linked emulations. If this later changes, one can use set_linked_idx()
to add linking information to the GP emulator. Defaults to NULL.
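
The second-layer example above can be laid out in base R to check that the column order of X[,internal_input_idx] agrees with linked_idx. This is a sketch under stated assumptions: layer1_out and external are hypothetical stand-ins for the pooled layer-1 output and an extra external input, and the commented gp() call needs the package and its Python environment.

```r
# Hypothetical pooled output of layer 1: five runs, three output columns
set.seed(1)
layer1_out <- matrix(rnorm(15), nrow = 5, ncol = 3)

# The second-layer emulator is fed by output dimensions 1 and 3 of layer 1
linked_idx <- list(c(1, 3))

# Its input X: two internal columns (from layer 1) plus one external column
external <- seq(0, 1, length.out = 5)
X <- cbind(layer1_out[, linked_idx[[1]]], external)
internal_input_idx <- c(1, 2)

# Columns 1 and 2 of X[, internal_input_idx] match output dimensions 1 and 3
stopifnot(all(X[, internal_input_idx] == layer1_out[, c(1, 3)]))
stopifnot(length(linked_idx[[1]]) == length(internal_input_idx))

# With dgpsi loaded and init_py() run, and Y the matching one-column output:
# m2 <- gp(X, Y, internal_input_idx = internal_input_idx,
#          linked_idx = linked_idx)
```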

Details
See further examples and tutorials at https://mingdeyu.github.io/dgpsiR/.
Value
An S3 class named gp that contains three slots:

constructor_obj: a 'python' object that stores the information of the constructed GP emulator.

container_obj: a 'python' object that stores the information for the linked emulation.

emulator_obj: a 'python' object that stores the information for the predictions from the GP emulator.
The returned gp
object can be used by
Note
Any R vector detected in X and Y will be treated as a column vector and automatically converted into a single-column R matrix.
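
In base R terms, the conversion described in this note amounts to the following. This only illustrates the resulting shape; how gp() performs the conversion internally is not stated on this page.

```r
# A plain numeric vector of ten inputs
x_vec <- seq(0, 1, length.out = 10)

# The equivalent single-column matrix: ten rows (data points), one column
x_mat <- matrix(x_vec, ncol = 1)
dim(x_mat)
```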
Examples
## Not run:
# load the package and the Python env
library(dgpsi)
init_py()
# construct a step function
f <- function(x) {
  if (x < 0.5) return(-1)
  if (x >= 0.5) return(1)
}
# generate training data
X <- seq(0, 1, length = 10)
Y <- sapply(X, f)
# training
m <- gp(X, Y)
# summarizing
summary(m)
# LOO cross validation
m <- validate(m)
plot(m)
# prediction
test_x <- seq(0, 1, length = 200)
m <- predict(m, x = test_x)
# OOS validation
validate_x <- sample(test_x, 10)
validate_y <- sapply(validate_x, f)
plot(m, validate_x, validate_y)
# write and read the constructed emulator
write(m, 'step_gp')
m <- read('step_gp')
## End(Not run)
[Package dgpsi version 2.1.5 Index]