CF_crossval {ZVCV} | R Documentation |
Control functionals (CF) with cross-validation
Description
This function chooses between a list of kernel tuning parameters (sigma_list) or a list of K0 matrices (K0_list) for the control functionals method described in Oates et al (2017). The latter requires calculating and storing kernel matrices using K0_fn, but it is more flexible because it can be used to choose the Stein operator order and the kernel function, in addition to the kernel parameters. It is also faster to pre-compute the matrices with K0_fn.
For estimation with fixed kernel parameters, use CF.
Usage
CF_crossval(
integrands,
samples,
derivatives,
steinOrder = NULL,
kernel_function = NULL,
sigma_list = NULL,
K0_list = NULL,
est_inds = NULL,
log_weights = NULL,
one_in_denom = FALSE,
folds = NULL,
diagnostics = FALSE
)
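A minimal usage sketch, assuming the ZVCV package is installed and that integrands may be supplied as an N by k matrix (the argument descriptions on this page are truncated, so the shapes used here are assumptions). The target is a standard normal in d = 2 dimensions, for which the gradient of the log density at x is -x, and the integrand is x_1^2 with true expectation 1. The values in sigma_list are arbitrary illustrative choices, not recommended defaults.

```r
set.seed(1)
N <- 100  # number of samples
d <- 2    # dimension of the target

# Samples from a standard normal target; the gradient of log N(0, I) at x is -x.
samples <- matrix(rnorm(N * d), nrow = N, ncol = d)
derivatives <- -samples

# One integrand (k = 1): f(x) = x_1^2, whose true expectation is 1.
integrands <- matrix(samples[, 1]^2, ncol = 1)

# Candidate tuning parameters for the Gaussian kernel (illustrative values only).
sigma_list <- list(0.5, 1, 2)

# Only call the package if it is available, so the sketch runs either way.
if (requireNamespace("ZVCV", quietly = TRUE)) {
  fit <- ZVCV::CF_crossval(
    integrands, samples, derivatives,
    steinOrder = 1, kernel_function = "gaussian",
    sigma_list = sigma_list, folds = 2
  )
  print(fit$expectation)  # estimate of E[x_1^2]
  print(fit$optinds)      # index into sigma_list chosen by cross-validation
}
```

The steinOrder and kernel_function arguments are set explicitly here so that the sketch does not rely on the package defaults.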
Arguments
integrands |
An |
samples |
An |
derivatives |
An |
steinOrder |
(optional) This is the order of the Stein operator. The default is |
kernel_function |
(optional) Choose between "gaussian", "matern", "RQ", "product" or "prodsim". See below for further details. |
sigma_list |
(optional between this and |
K0_list |
(optional between this and |
est_inds |
(optional) A vector of indices for the estimation-only samples. The default when |
log_weights |
(optional) A vector of length |
one_in_denom |
(optional) Whether or not to include a |
folds |
(optional) The number of folds for cross-validation. The default is five. |
diagnostics |
(optional) A flag for whether to return the necessary outputs for plotting or estimating using the fitted model. The default is |
Value
A list with the following elements:
- expectation: The estimate(s) of the (k) expectation(s).
- mse: A matrix of the cross-validation mean square prediction errors. The number of columns is the number of tuning options given and the number of rows is k, the number of integrands of interest.
- optinds: The optimal indices from the list for each expectation.
- f_true: (Only if est_inds is not NULL) The integrands for the evaluation set. This should be the same as integrands[setdiff(1:N,est_inds),].
- f_hat: (Only if est_inds is not NULL) The fitted values for the integrands in the evaluation set. This can be used to help assess the performance of the Gaussian process model.
- a: (Only if diagnostics = TRUE) The value of a as described in South et al (2020), where predictions are of the form f_hat = K0*a + 1*b for heldout K0 and estimators using heldout samples are of the form mean(f - f_hat) + b.
- b: (Only if diagnostics = TRUE) The value of b as described in South et al (2020), where predictions are of the form f_hat = K0*a + 1*b for heldout K0 and estimators using heldout samples are of the form mean(f - f_hat) + b.
- ksd: (Only if diagnostics = TRUE) An estimated kernel Stein discrepancy based on the fitted model that can be used for diagnostic purposes. See South et al (2020) for further details.
- bound_const: (Only if diagnostics = TRUE and est_inds = NULL) This is such that the absolute error for the estimator should be less than ksd \times bound_const.
Warning
Solving the linear system in CF has O(N^3) complexity and is therefore not suited to large N. Using est_inds will instead have an O(N_0^3) cost in solving the linear system and an O((N-N_0)^2) cost in handling the remaining samples, where N_0 is the length of est_inds. This can be much cheaper for large N.
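To give a feel for the saving, the arithmetic below compares the leading-order operation counts for hypothetical values of N and N_0. The numbers are illustrative only; constants and lower-order terms are ignored, so the ratio is a rough guide rather than a timing prediction.

```r
# Leading-order operation counts only; constants are ignored.
N  <- 10000  # total number of samples (illustrative)
N0 <- 1000   # length of est_inds (illustrative)

cost_full  <- N^3                # O(N^3) solve using all samples
cost_split <- N0^3 + (N - N0)^2  # O(N_0^3) solve plus O((N - N_0)^2) for the rest

ratio <- cost_full / cost_split
print(ratio)  # roughly 925 for these values
```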
On the choice of \sigma, the kernel and the Stein order
The kernel in Stein-based kernel methods is L_x L_y k(x,y), where L_x is a first or second order Stein operator in x and k(x,y) is some generic kernel to be specified.
The Stein operators for distribution p(x) are defined as:
- steinOrder=1: L_x g(x) = \nabla_x^T g(x) + \nabla_x \log p(x)^T g(x) (see e.g. Oates et al (2017))
- steinOrder=2: L_x g(x) = \Delta_x g(x) + \nabla_x \log p(x)^T \nabla_x g(x) (see e.g. South et al (2020))
Here \nabla_x is the first order derivative with respect to x and \Delta_x = \nabla_x^T \nabla_x is the Laplacian operator.
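The two operators can be checked on a one-dimensional example. The sketch below assumes a standard normal target, so that the derivative of log p(x) is -x, and picks g(x) = x for the first order operator and g(x) = x^2 for the second. Under standard regularity conditions a Stein operator applied to g has zero expectation under p, which is what makes it useful for building control functionals; that property is not stated on this page, and the function names are hypothetical.

```r
# Target p = N(0, 1), so d/dx log p(x) = -x.
score <- function(x) -x

# steinOrder = 1 with g(x) = x:   L g(x) = g'(x) + score(x) * g(x) = 1 - x^2
L1_g <- function(x) 1 + score(x) * x

# steinOrder = 2 with g(x) = x^2: L g(x) = g''(x) + score(x) * g'(x) = 2 - 2 * x^2
L2_g <- function(x) 2 + score(x) * (2 * x)

# Expectations under p by numerical quadrature; both should be numerically zero.
E1 <- integrate(function(x) L1_g(x) * dnorm(x), -Inf, Inf)$value
E2 <- integrate(function(x) L2_g(x) * dnorm(x), -Inf, Inf)$value
print(c(E1, E2))
```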
The generic kernels which are implemented in this package are listed below. Note that the input parameter sigma defines the kernel parameters \sigma.
- "gaussian": A Gaussian kernel, k(x,y) = \exp(-z(x,y)/\sigma^2)
- "matern": A Matern kernel with \sigma = (\lambda,\nu), k(x,y) = b c^{\nu} z(x,y)^{\nu/2} K_{\nu}(c z(x,y)^{0.5}) where b = 2^{1-\nu}(\Gamma(\nu))^{-1}, c = (2\nu)^{0.5}\lambda^{-1} and K_{\nu}(x) is the modified Bessel function of the second kind. Note that \lambda is the length-scale parameter and \nu is the smoothness parameter (which defaults to 2.5 for steinOrder=1 and 4.5 for steinOrder=2).
- "RQ": A rational quadratic kernel, k(x,y) = (1+\sigma^{-2}z(x,y))^{-1}
- "product": The product kernel that appears in Oates et al (2017) with \sigma = (a,b), k(x,y) = (1+a z(x) + a z(y))^{-1} \exp(-0.5 b^{-2} z(x,y))
- "prodsim": A slightly different product kernel with \sigma = (a,b) (see e.g. https://www.imperial.ac.uk/inference-group/projects/monte-carlo-methods/control-functionals/), k(x,y) = (1+a z(x))^{-1}(1 + a z(y))^{-1} \exp(-0.5 b^{-2} z(x,y))
In the above equations, z(x) = \sum_j x[j]^2 and z(x,y) = \sum_j (x[j] - y[j])^2. For the last two kernels, the code only has implementations for steinOrder=1. Each combination of steinOrder and kernel_function above is currently hard-coded but it may be possible to extend this to other kernels in future versions using autodiff. The calculations for the first three kernels above are detailed in South et al (2020).
Author(s)
Leah F. South
References
Oates, C. J., Girolami, M. & Chopin, N. (2017). Control functionals for Monte Carlo integration. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(3), 695-718.
South, L. F., Karvonen, T., Nemeth, C., Girolami, M. and Oates, C. J. (2020). Semi-Exact Control Functionals From Sard's Method. https://arxiv.org/abs/2002.00033
See Also
See ZVCV for examples and related functions. See CF for a function to perform control functionals with fixed kernel specifications.