| mle-methods {kergp} | R Documentation |
Maximum Likelihood Estimation of Gaussian Process Model Parameters
Description
Maximum Likelihood estimation of Gaussian Process models which covariance structure is described in a covariance kernel object.
Usage
## S4 method for signature 'covAll'
mle(object,
y, X, F = NULL, beta = NULL,
parCovIni = coef(object),
parCovLower = coefLower(object),
parCovUpper = coefUpper(object),
noise = TRUE, varNoiseIni = var(y) / 10,
varNoiseLower = 0, varNoiseUpper = Inf,
compGrad = hasGrad(object),
doOptim = TRUE,
optimFun = c("nloptr::nloptr", "stats::optim"),
optimMethod = ifelse(compGrad, "NLOPT_LD_LBFGS", "NLOPT_LN_COBYLA"),
optimCode = NULL,
multistart = 1,
parTrack = FALSE, trace = 0, checkNames = TRUE,
...)
Arguments
object |
An object representing a covariance kernel. |
y |
Response vector. |
X |
Spatial (or input) design matrix. |
F |
Trend matrix. |
beta |
Vector of trend coefficients if known/fixed. |
parCovIni |
Vector with named elements or matrix giving the initial values for the
parameters. See Examples. When this argument is omitted, the
vector of covariance parameters given in |
parCovLower |
Lower bounds for the parameters. When this argument is omitted, the
vector of parameters lower bounds in the covariance given in
|
parCovUpper |
Upper bounds for the parameters. When this argument is omitted, the
vector of parameters lower bounds in the covariance given in
|
noise |
Logical. Should a noise be added to the error term? |
varNoiseIni |
Initial value for the noise variance. |
varNoiseLower |
Lower bound for the noise variance. Should be |
varNoiseUpper |
Upper bound for the noise variance. Should be |
compGrad |
Logical: compute and use the analytical gradient in optimization?
This is only possible when |
doOptim |
Logical. If |
optimFun |
Function used for optimization. The two pre-defined choices are
|
optimMethod |
Name of the optimization method or algorithm. This is passed as the
|
optimCode |
An object with class |
multistart |
Number of optimizations to perform from different starting points
(see |
parTrack |
If |
trace |
Integer level of verbosity. |
checkNames |
if |
... |
Further arguments to be passed to the optimization function,
|
Details
The choice of optimization method is as follows.
When
optimFunisnloptr:nloptr, it is assumed that we are minimizing the negative log-likelihood- \log L. Note that both predefined methods"NLOPT_LD_LBFGS"and"NLOPT_LN_COBYLA"can cope with a non-finite value of the objective, except for the initial value of the parameter. Non-finite values of- \log Lare often met in practice during optimization steps. The method"NLOPT_LD_LBFGS"used whencompGradisTRUErequires that the gradient is provided by/with the covariance object. You can try other values ofoptimMethodcorresponding to the possible choice of the"algorithm"element in theoptsargument ofnloptr:nloptr. It may be useful to give other options in order to control the optimization and its stopping rule.When
optimFunis"stats:optim", it is assumed that we are maximizing the log-likelihood\log L. We suggest to use one of the methods"L-BFGS-B"or"BFGS". Notice thatcontrolcan be provided in..., butcontrol$fnscaleis forced to be- 1, corresponding to maximization. Note that"L-BFGS-B"uses box constraints, but the optimization stops as soon as the log-likelihood is non-finite orNA. The method"BFGS"does not use constraints but allows the log-likelihood to be non-finite orNA. Both methods can be used without gradient or with gradient ifobjectallows this.
The vectors parCovIni, parCovLower, parCovUpper
must have elements corresponding to those of the vector of kernel
parameters given by coef(object). These vectors should have
suitably named elements.
Value
A list with elements hopefully having understandable names.
opt |
List of optimization results if it was successful, or an error object otherwise. |
coef.kernel |
The vector of 'kernel' coefficients. This will include one or several variance parameters. |
coef.trend |
Estimate of the vector |
parTracked |
A matrix with rows giving the successive iterates during
optimization if the |
Note
The checks concerning the parameter names, dimensions of provided objects, ... are not fully implemented yet.
Using the optimCode possibility requires a bit of programming
effort, although a typical code only contains a few lines.
Author(s)
Y. Deville, O. Roustant
See Also
gp for various examples, optimMethods
to see the possible values of the argument optimMethod.
Examples
set.seed(29770)
##=======================================================================
## Example. A 4-dimensional "covMan" kernel
##=======================================================================
d <- 4
myCovMan <-
covMan(
kernel = function(x1, x2, par) {
htilde <- (x1 - x2) / par[1]
SS2 <- sum(htilde^2)
d2 <- exp(-SS2)
kern <- par[2] * d2
d1 <- 2 * kern * SS2 / par[1]
attr(kern, "gradient") <- c(theta = d1, sigma2 = d2)
return(kern)
},
label = "myGauss",
hasGrad = TRUE,
d = 4,
parLower = c(theta = 0.0, sigma2 = 0.0),
parUpper = c(theta = +Inf, sigma2 = +Inf),
parNames = c("theta", "sigma2"),
par = c(NA, NA)
)
kernCode <- "
SEXP kern, dkern;
int nprotect = 0, d;
double SS2 = 0.0, d2, z, *rkern, *rdkern;
d = LENGTH(x1);
PROTECT(kern = allocVector(REALSXP, 1)); nprotect++;
PROTECT(dkern = allocVector(REALSXP, 2)); nprotect++;
rkern = REAL(kern);
rdkern = REAL(dkern);
for (int i = 0; i < d; i++) {
z = ( REAL(x1)[i] - REAL(x2)[i] ) / REAL(par)[0];
SS2 += z * z;
}
d2 = exp(-SS2);
rkern[0] = REAL(par)[1] * d2;
rdkern[1] = d2;
rdkern[0] = 2 * rkern[0] * SS2 / REAL(par)[0];
SET_ATTR(kern, install(\"gradient\"), dkern);
UNPROTECT(nprotect);
return kern;
"
## inline the C function into an R function: MUCH MORE EFFICIENT!!!
## Not run:
require(inline)
kernC <- cfunction(sig = signature(x1 = "numeric", x2 = "numeric",
par = "numeric"),
body = kernCode)
myCovMan <- covMan(kernel = kernC, hasGrad = TRUE, label = "myGauss", d = 4,
parNames = c("theta", "sigma2"),
parLower = c("theta" = 0.0, "sigma2" = 0.0),
parUpper = c("theta" = Inf, "sigma2" = Inf))
## End(Not run)
##=======================================================================
## Example (continued). Simulate data for covMan and trend
##=======================================================================
n <- 100;
X <- matrix(runif(n * d), nrow = n)
colnames(X) <- inputNames(myCovMan)
coef(myCovMan) <- myPar <- c(theta = 0.5, sigma2 = 2)
C <- covMat(object = myCovMan, X = X,
compGrad = FALSE, index = 1L)
library(MASS)
set.seed(456)
y <- mvrnorm(mu = rep(0, n), Sigma = C)
p <- rpois(1, lambda = 4)
if (p > 0) {
F <- matrix(runif(n * p), nrow = n, ncol = p)
beta <- rnorm(p)
y <- F %*% beta + y
} else F <- NULL
par <- parCovIni <- c("theta" = 0.6, "sigma2" = 4)
##=======================================================================
## Example (continued). ML estimation. Note the 'partrack' argument
##=======================================================================
est <- mle(object = myCovMan,
parCovIni = parCovIni,
y = y, X = X, F = F,
parCovLower = c(0.05, 0.05), parCovUpper = c(10, 100),
parTrack = TRUE, noise = FALSE, checkNames = FALSE)
est$opt$value
## change the (constrained) optimization method
## Not run:
est1 <- mle(object = myCovMan,
parCovIni = parCovIni,
optimFun = "stats::optim",
optimMethod = "L-BFGS-B",
y = y, X = X, F = F,
parCovLower = c(0.05, 0.05), parCovUpper = c(10, 100),
parTrack = TRUE, noise = FALSE, checkNames = FALSE)
est1$opt$value
## End(Not run)
##=======================================================================
## Example (continued). Grid for graphical analysis
##=======================================================================
## Not run:
theta.grid <- seq(from = 0.1, to = 0.7, by = 0.2)
sigma2.grid <- seq(from = 0.3, to = 6, by = 0.4)
par.grid <- expand.grid(theta = theta.grid, sigma2 = sigma2.grid)
ll <- apply(as.matrix(par.grid), 1, est$logLikFun)
llmat <- matrix(ll, nrow = length(theta.grid),
ncol = length(sigma2.grid))
## End(Not run)
##=======================================================================
## Example (continued). Explore the surface ?
##=======================================================================
## Not run:
require(rgl)
persp3d(x = theta.grid, y = sigma2.grid, z = ll,
xlab = "theta", ylab = "sigma2", zlab = "logLik",
col = "SpringGreen3", alpha = 0.6)
## End(Not run)
##=======================================================================
## Example (continued). Draw a contour plot for the log-lik
## and show iterates
##=======================================================================
## Not run:
contour(x = theta.grid, y = sigma2.grid, z = llmat,
col = "SpringGreen3", xlab = "theta", ylab = "sigma2",
main = "log-likelihood contours and iterates",
xlim = range(theta.grid, est$parTracked[ , 1], na.rm = TRUE),
ylim = range(sigma2.grid, est$parTracked[ , 2], na.rm = TRUE))
abline(v = est$coef.kernel[1], h = est$coef.kernel[2], lty = "dotted")
niter <- nrow(est$parTracked)
points(est$parTracked[1:niter-1, ],
col = "orangered", bg = "yellow", pch = 21, lwd = 2, type = "o")
points(est$parTracked[niter, , drop = FALSE],
col = "blue", bg = "blue", pch = 21, lwd = 2, type = "o", cex = 1.5)
ann <- seq(from = 1, to = niter, by = 5)
text(x = est$parTracked[ann, 1], y = est$parTracked[ann, 2],
labels = ann - 1L, pos = 4, cex = 0.8, col = "orangered")
points(x = myPar["theta"], y = myPar["sigma2"],
bg = "Chartreuse3", col = "ForestGreen",
pch = 22, lwd = 2, cex = 1.4)
legend("topright", legend = c("optim", "optim (last)", "true"),
pch = c(21, 21, 22), lwd = c(2, 2, 2), lty = c(1, 1, NA),
col = c("orangered", "blue", "ForestGreen"),
pt.bg = c("yellow", "blue", "Chartreuse3"))
## End(Not run)