ktd_cv2d {ktweedie}    R Documentation

Cross-validation for jointly tuning the regularization coefficient and kernel parameter in the Kernel Tweedie Model

Description

ktd_cv2d() performs a two-dimensional random search over user-specified ranges to determine the optimal pair of regularization coefficient and kernel parameter for the ktweedie model.

Usage

ktd_cv2d(
  x,
  y,
  kernfunc,
  lambda,
  sigma,
  ncoefs,
  nfolds = 5,
  rho = 1.5,
  loss = "LL",
  ...
)

Arguments

x

Covariate matrix.

y

Outcome vector (e.g. insurance cost).

kernfunc

Choice of kernel function, given as the function name (e.g. rbfdot) rather than a kernel object. See dots in the kernlab package for details on supported kernel functions.

lambda

A vector of length two giving the lower and upper bounds of the range from which candidate regularization coefficient values are sampled uniformly on the log scale.

sigma

A vector of length two giving the lower and upper bounds of the range from which candidate kernel parameter values are sampled uniformly on the log scale.

ncoefs

The number of candidate lambda and sigma pairs to be evaluated.
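
To illustrate what "sampled uniformly on the log scale" means for lambda and sigma, the following base R sketch draws ncoefs candidate pairs from two ranges. The helper sample_log_uniform() is hypothetical and is not part of ktweedie; the package's internal sampling code may differ in detail.

```r
# Sketch only: draw `ncoefs` (lambda, sigma) pairs uniformly on the log scale.
# `sample_log_uniform` is a hypothetical helper, not a ktweedie function.
sample_log_uniform <- function(n, range) {
  stopifnot(length(range) == 2, all(range > 0), range[1] <= range[2])
  # Uniform draw between log(lower) and log(upper), mapped back with exp().
  exp(runif(n, min = log(range[1]), max = log(range[2])))
}

set.seed(1)
ncoefs <- 10
candidates <- data.frame(
  lambda = sample_log_uniform(ncoefs, c(1e-3, 1e0)),
  sigma  = sample_log_uniform(ncoefs, c(1e-3, 1e0))
)
candidates
```

Sampling on the log scale spreads the candidates evenly across orders of magnitude, so a range such as c(1e-3, 1e0) yields roughly as many values between 0.001 and 0.01 as between 0.1 and 1.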

nfolds

Number of folds in cross-validation. Default is 5.

rho

The power parameter of the Tweedie model. Can take any real value between 1 and 2. Default is 1.5.

loss

Criterion used in cross-validation. "LL" for log likelihood, "RMSE" for root mean squared error, "MAD" for mean absolute difference. Default is "LL".
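
As a rough illustration of the two prediction-error criteria, the sketch below computes root mean squared error and mean absolute difference in base R. The helper names rmse() and mad_loss() and the toy vectors are made up for this example; it is an assumption that ktweedie computes these criteria in exactly this form. The log likelihood criterion depends on the Tweedie density and is not sketched here.

```r
# Hypothetical helpers, not ktweedie functions.
rmse     <- function(y, pred) sqrt(mean((y - pred)^2))  # root mean squared error
mad_loss <- function(y, pred) mean(abs(y - pred))       # mean absolute difference

# Toy outcome and prediction vectors, invented for illustration.
y    <- c(0, 2, 5, 0, 10)
pred <- c(1, 2, 3, 1, 8)

rmse(y, pred)      # sqrt(2), about 1.414
mad_loss(y, pred)  # 1.2
```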

...

Optional arguments to be passed to ktd_estimate().

Details

ktd_cv2d() is a built-in wrapper for 2D random search over the regularization coefficient and kernel parameter. For kernel functions with more than one parameter, ktd_cv2d() tunes only the first one.
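
The general idea of a 2D random search scored by K-fold cross-validation can be sketched in base R as follows. This is not the ktweedie implementation: as a stand-in for the kernel Tweedie model it uses a plain RBF kernel ridge regression, scores by RMSE, and runs on simulated data. The helpers rbf_kernel() and cv_rmse() are hypothetical.

```r
# Stand-in model: RBF kernel ridge regression in base R (NOT the Tweedie model).
rbf_kernel <- function(a, b, sigma) {
  # k(u, v) = exp(-sigma * ||u - v||^2), the same convention as kernlab's rbfdot
  d2 <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * a %*% t(b)
  exp(-sigma * pmax(d2, 0))
}

# Mean validation RMSE over the folds for one (lambda, sigma) pair.
cv_rmse <- function(x, y, lambda, sigma, folds) {
  errs <- sapply(sort(unique(folds)), function(k) {
    tr <- folds != k
    K  <- rbf_kernel(x[tr, , drop = FALSE], x[tr, , drop = FALSE], sigma)
    alpha <- solve(K + lambda * diag(sum(tr)), y[tr])
    pred  <- rbf_kernel(x[!tr, , drop = FALSE], x[tr, , drop = FALSE], sigma) %*% alpha
    sqrt(mean((y[!tr] - pred)^2))
  })
  mean(errs)
}

# Simulated data, invented for illustration.
set.seed(42)
n <- 60
x <- matrix(rnorm(n * 2), n, 2)
y <- sin(x[, 1]) + 0.1 * rnorm(n)

# Assign each observation to one of nfolds folds at random.
nfolds <- 5
folds  <- sample(rep_len(seq_len(nfolds), n))

# Draw ncoefs candidate pairs uniformly on the log scale.
ncoefs <- 10
lambda <- exp(runif(ncoefs, log(1e-3), log(1e0)))
sigma  <- exp(runif(ncoefs, log(1e-3), log(1e0)))

# Score every pair and keep the one with the smallest validation error.
scores <- mapply(function(l, s) cv_rmse(x, y, l, s, folds), lambda, sigma)
best   <- which.min(scores)
c(lambda = lambda[best], sigma = sigma[best], rmse = scores[best])
```

Random search evaluates only ncoefs pairs in total, so its cost does not grow with the product of two grid sizes the way a full 2D grid search does.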

Value

A list of three items.

  1. LL, RMSE or MAD (matching the loss argument): a vector of validation errors under the user-specified loss, named by the corresponding lambda and sigma values;

  2. Best_lambda: the lambda value in the pair that generates the best loss;

  3. Best_sigma: the sigma value in the pair that generates the best loss.
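
Because the name of the first item depends on the loss argument, it can be convenient to access it by position. The sketch below uses a mock list that mimics the documented structure. The numbers and the format of the element names are invented for illustration and are not actual ktd_cv2d() output.

```r
# Mock object mimicking the documented return structure for loss = "RMSE".
# Values and the name format are made up; real output may look different.
cv2d_mock <- list(
  RMSE = c("Lambda=0.002, Sigma=0.31" = 3.9,
           "Lambda=0.15, Sigma=0.02"  = 3.1,
           "Lambda=0.6, Sigma=0.004"  = 4.4),
  Best_lambda = 0.15,
  Best_sigma  = 0.02
)

loss_name <- names(cv2d_mock)[1]  # "RMSE" here; depends on the loss argument
errors    <- cv2d_mock[[1]]       # positional access works for any loss

# For RMSE a smaller value is better, so the best pair is the minimum.
names(errors)[which.min(errors)]
sort(errors)
```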

See Also

ktd_cv, ktd_estimate, ktd_predict

Examples

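# Load the package. dat is assumed to be the demo dataset shipped with
# ktweedie (a list with a covariate matrix x and an outcome vector y).
library(ktweedie)
### Cross-validation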
# Provide the kernel function name (e.g. rbfdot) to the argument kernfunc,
# NOT the kernel function object, e.g. rbfdot(sigma = 1).
# Provide ranges where the candidate lambdas and sigmas are drawn from
# to the arguments lambda and sigma.
# The number of pairs of candidates to select from is specified by ncoefs.
( cv2d <- ktd_cv2d(x = dat$x, y = dat$y,
                   kernfunc = rbfdot,
                   lambda = c(1e-3, 1e0),
                   sigma = c(1e-3, 1e0),
                   ncoefs = 10) )
### Followed by fitting
fit <- ktd_estimate(x = dat$x, y = dat$y,
                    kern = rbfdot(sigma = cv2d$Best_sigma),
                    lam1 = cv2d$Best_lambda)

[Package ktweedie version 1.0.3 Index]