| Fit {GaSP} | R Documentation | 
Fit a GaSP model.
Description
Fit (train) a GaSP model.
Usage
Fit(
  x,
  y,
  reg_model,
  sp_model = NULL,
  cor_family = c("PowerExponential", "Matern"),
  cor_par = data.frame(0),
  random_error = c(FALSE, TRUE),
  sp_var = -1,
  error_var = -1,
  nugget = 1e-09,
  tries = 10,
  seed = 500,
  fit_objective = c("Likelihood", "Posterior"),
  theta_standardized_min = 0,
  theta_standardized_max = .Machine$double.xmax,
  alpha_min = 0,
  alpha_max = 1,
  derivatives_min = 0,
  derivatives_max = 3,
  log_obj_tol = 1e-05,
  log_obj_diff = 0,
  lambda_prior = 0.1,
  model_comparison = c("Objective", "CV")
)
Arguments
| x | A data frame containing the input (explanatory variable) training data. | 
| y | A vector or a data frame with one column containing the output (response) training data. | 
| reg_model | The regression model, specified as a formula; the left-hand side of the formula is unused (see Examples). | 
| sp_model | An optional stochastic process model, specified as a formula,
but note the left-hand side of the formula and the intercept are unused.
The default  | 
| cor_family | A character string specifying the (product, anisotropic) correlation-function family: "PowerExponential" for the power-exponential family or "Matern" for the Matern family. | 
| cor_par | An optional data frame containing the correlation parameters
with one row per  | 
| random_error | A boolean indicating whether a random (measurement, white-noise) error term is present. | 
| sp_var,error_var | Starting values of the stochastic process and error variances
for the first try to optimize the objective (see Details);
valid (i.e., nonnegative) values will only be used if  | 
| nugget | For numerical stability the proportion of the total variance
due to random error is fixed at this value ( | 
| tries | Number of optimizations of the objective from different random starting points. | 
| seed | The random-number seed to generate starting points. | 
| fit_objective | The objective that  | 
| theta_standardized_min,theta_standardized_max | The minimum and maximum of the standardized  | 
| alpha_min,alpha_max | The minimum and maximum
of the  | 
| derivatives_min,derivatives_max | The minimum and maximum
of the  | 
| log_obj_tol | An absolute tolerance for terminating the optimization of the log of the objective. | 
| log_obj_diff | The critical value for the change in the log objective for informal tests during optimization of correlation parameters. No testing is done with the default of 0; a larger critical value such as 2 may give a more parsimonious model. | 
| lambda_prior | The rate parameter of an exponential prior
for each  | 
| model_comparison | The criterion used to select from multiple solutions
when  | 
Details
Fit numerically optimizes the profile objective function with respect to the correlation parameters; the mean and overall variance parameters are estimated in closed form given the correlation parameters.
A cor_par data frame supplied by the user is the starting point
for the first optimization try.
If random_error = TRUE,
then sp_var / (sp_var + error_var) is another
correlation parameter to be optimized;
sp_var and error_var values supplied by the user
will initialize this parameter for the first try.
Set random_error = TRUE to estimate the variance of the
random (measurement, white-noise) error;
a small nugget error variance is included for numerical stability.
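As a rough illustration of the variance-proportion parameter described above, the following base R sketch computes sp_var / (sp_var + error_var). The numeric values are hypothetical, chosen only for illustration, and the code is not taken from the GaSP internals.

```r
# Hypothetical starting values (not from the package); both are nonnegative.
sp_var <- 3
error_var <- 1

# Proportion of the total variance attributed to the stochastic process.
sp_prop <- sp_var / (sp_var + error_var)

# The complementary proportion is attributed to random error.
error_prop <- 1 - sp_prop

print(c(sp_prop = sp_prop, error_prop = error_prop))
```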
For term j in the stochastic-process model,
the estimate of \theta_j is constrained between
theta_standardized_min / r_j^2 and
theta_standardized_max / r_j^2,
where r_j is the range of term j.
Note that Fit returns unscaled estimates relating to the original, unscaled inputs.
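The scaling of the theta bounds can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the input data are hypothetical, each term j is assumed to be a single input column, r_j is assumed to be the maximum minus the minimum of that column, and theta_standardized_max = 100 is an arbitrary value used in place of the default.

```r
# Hypothetical training inputs; each column is treated as one term j.
x <- data.frame(x1 = c(0, 0.5, 2), x2 = c(10, 30, 20))

# Assumption: the range r_j is max - min of column j.
r <- vapply(x, function(col) diff(range(col)), numeric(1))

# Standardized bounds (the maximum here is a hypothetical choice).
theta_standardized_min <- 0
theta_standardized_max <- 100

# Bounds on theta_j on the original, unscaled input scale.
theta_bounds <- data.frame(
  term = names(x),
  range = unname(r),
  theta_min = unname(theta_standardized_min / r^2),
  theta_max = unname(theta_standardized_max / r^2)
)
print(theta_bounds)
```

Under these assumptions, an input with a wider range receives a smaller upper bound on theta_j, so the same standardized bound means the same thing for every input once its range is taken into account.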
Value
A GaSPModel object, which is a list with the following components:
| x | The data frame containing the input training data. | 
| y | The training output data, now as a vector. | 
| reg_model | The regression model, now in the form of a data frame. | 
| sp_model | The stochastic process model, now in the form of a data frame. | 
| cor_family | The correlation family. | 
| cor_par | A data frame for the estimated correlation parameters. | 
| random_error | The boolean indicating whether a random error term is present. | 
| sp_var | The estimated stochastic process variance. | 
| error_var | The estimated random error variance. | 
| beta | A data frame holding the estimated regression-model parameters. | 
| objective | The maximum value found for the objective function: the log likelihood (fit_objective = "Likelihood") or the log posterior (fit_objective = "Posterior"). | 
| cond_num | The condition number. | 
| CVRMSE | The leave-one-out cross-validation root mean squared error. | 
References
Sacks, J., Welch, W.J., Mitchell, T.J., and Wynn, H.P. (1989) "Design and Analysis of Computer Experiments", Statistical Science, 4, pp. 409-423, doi:10.1214/ss/1177012413.
Examples
x <- borehole$x
y <- borehole$y
borehole_fit <- Fit(
  reg_model = ~1, x = x, y = y, cor_family = "Matern",
  random_error = FALSE, nugget = 0, fit_objective = "Posterior"
)
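A possible follow-up, sketched here rather than taken from the package examples, is to inspect some of the components listed under Value. The object name fit_sketch is hypothetical, the call repeats the arguments of the example above, and the code is skipped when the GaSP package is not installed.

```r
# Run only if the GaSP package is available.
if (requireNamespace("GaSP", quietly = TRUE)) {
  library(GaSP)

  # Same data and settings as the example above.
  x <- borehole$x
  y <- borehole$y
  fit_sketch <- Fit(
    reg_model = ~1, x = x, y = y, cor_family = "Matern",
    random_error = FALSE, nugget = 0, fit_objective = "Posterior"
  )

  # Components documented in the Value section.
  print(fit_sketch$cor_par)    # estimated correlation parameters
  print(fit_sketch$sp_var)     # estimated stochastic process variance
  print(fit_sketch$objective)  # maximum of the objective found
  print(fit_sketch$CVRMSE)     # leave-one-out cross-validation RMSE
}
```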