sk_fit {snapKrig} | R Documentation |
Fit a covariance model to an sk grid by maximum likelihood
Description
This uses stats::optim
to minimize the log-likelihood function for a grid dataset
g
over the space of unknown parameters for the covariance function specified in pars
.
If only one parameter is unknown, the function instead uses stats::optimize
.
Usage
sk_fit(
g,
pars = NULL,
X = NA,
iso = TRUE,
n_max = 1000,
quiet = FALSE,
lower = NULL,
initial = NULL,
upper = NULL,
log_scale = TRUE,
method = "L-BFGS-B",
control = list()
)
Arguments
g |
an sk grid (or list with entries 'gdim', 'gres', 'gval') |
pars |
covariance parameter list, with |
X |
numeric (or NA), matrix, or sk grid of linear predictors, passed to |
iso |
logical, indicating to constrain the y and x kernel parameters to be the same |
n_max |
integer, the maximum number of observations allowed |
quiet |
logical, indicating to suppress console output |
lower |
numeric vector, lower bounds for parameters |
initial |
numeric vector, initial values for parameters |
upper |
numeric vector, upper bounds for parameters |
log_scale |
logical, indicating to log-transform parameters for optimization |
method |
character, passed to |
control |
list, passed to |
Details
NA
entries in pars
are treated as unknown parameters and fitted by the
function, whereas non-NA
entries are treated as fixed parameters (and not fitted).
If none of the parameters in pars
are NA
, the function copies everything as initial
values, then treats all parameters as unknown. pars
can also be a character vector
defining a pair of correlograms (see ?sk_pars
) in which case all covariance parameters
are treated as unknown.
Bounds and initial values are set automatically using sk_bds
, unless they are otherwise
specified in arguments lower
, initial
, upper
. These should be vectors containing only
the unknown parameters, ie. they must exclude fixed parameters. Call
sk_update_pars(pars, iso=iso)
to get a template with the expected names and order.
All parameters in the covariance models supported by snapKrig
are strictly positive.
Optimization is (by default) done on the parameter log-scale, and users can select a
non-constrained method
if they wish (?stats::optim
). As the default method 'L-BFGS-B'
is the only one that accepts bounds (lower
, initial
, upper
are otherwise ignored)
method
is ignored when log_scale=FALSE
.
Note that the 'gxp' and 'mat' correlograms behave strangely with very small or very large shape parameters, so for them we recommended 'L-BFGS-B' only.
When there is only one unknown parameter, the function uses stats::optimize
instead of
stats::optim
. In this case all entries of control
with the exception of 'tol' are
ignored, as are bounds and initial values, and arguments to method
.
As a sanity check n_max
sets a maximum for the number of observed grid points. This
is to avoid accidental calls with very large datasets that would cause R to hang or crash.
Set n_max=Inf
(with caution) to bypass this check. Similarly the maximum number of
iterations is set to 1e3
but this can be changed by manually setting 'maxit' in
control
.
Value
A parameter list in the form returned by sk_pars
containing both fixed and
fitted parameters. The data-frame of bounds and initial values is also included in the
attribute 'bds'
See Also
sk sk_LL sk_nLL stats::optim stats::optimize
Other parameter managers:
sk_bds()
,
sk_kp()
,
sk_pars_make()
,
sk_pars_update()
,
sk_pars()
,
sk_to_string()
Examples
# define a grid
gdim = c(50, 51)
g_empty = sk(gdim)
pars_src = sk_pars(g_empty)
pars_src = utils::modifyList(pars_src, list(eps=runif(1, 0, 1e1), psill=runif(1, 0, 1e2)))
pars_src[['y']][['kp']] = pars_src[['x']][['kp']] = runif(1, 1, 50)
# generate example data
g_obs = sk_sim(g_empty, pars_src)
sk_plot(g_obs)
# fit (set maxit low to keep check time short)
fit_result = sk_fit(g_obs, pars='gau', control=list(maxit=25), quiet=TRUE)
# compare estimates with true values
rbind(true=sk_pars_update(pars_src), fitted=sk_pars_update(fit_result))
# extract bounds data frame
attr(fit_result, 'bds')
# non-essential examples skipped to stay below 5s exec time on slower machines
# check a sequence of other psill values
pars_out = fit_result
psill_test = ( 2^(seq(5) - 3) ) * pars_out[['psill']]
LL_test = sapply(psill_test, function(s) sk_LL(utils::modifyList(pars_out, list(psill=s)), g_obs) )
plot(psill_test, LL_test)
lines(psill_test, LL_test)
print(data.frame(psill=psill_test, likelihood=LL_test))
# repeat with most data missing
n = prod(gdim)
n_obs = 25
g_obs = sk_sim(g_empty, pars_src)
idx_miss = sample.int(length(g_empty), length(g_empty) - n_obs)
g_miss = g_obs
g_miss[idx_miss] = NA
sk_plot(g_miss)
# fit (set maxit low to keep check time short) and compare
fit_result = sk_fit(g_miss, pars='gau', control=list(maxit=25), quiet=TRUE)
rbind(true=sk_pars_update(pars_src), fitted=sk_pars_update(fit_result))