cv_resmooth {drape}

R Documentation

K-fold cross-validation for resmoothing bandwidth.

Description

Picks the largest resmoothing bandwidth whose cross-validation score is within a specified tolerance of the cross-validation score of the original (unsmoothed) regression.

Usage

cv_resmooth(
  X,
  y,
  d = 1,
  regression,
  tol = 2,
  prefit = FALSE,
  foldid = NULL,
  bw = exp(seq(-5, 2, 0.2))/(2 * sqrt(3)) * stats::sd(X[, d]),
  nfolds = 5L,
  n_points = 101,
  sd_trim = 5
)

Arguments

`X`	matrix of covariates.
`y`	vector of responses.
`d`	integer index of covariate to be smoothed along.
`regression`	If prefit = FALSE, a function which takes input data of the form (X, y) and returns a prediction function (in the Examples, this is returned as the "fit" component of a list). The prediction function accepts a matrix with the same number of columns as X and returns a vector of y-predictions, and optionally a vector of derivative predictions. If prefit = TRUE, a list of length nfolds in which each entry contains a component "fit": a prediction function taking matrix input and returning a vector.
`tol`	vector of tolerances controlling the degree of permissible cross-validation error increase. Larger values lead to a larger amount of smoothing being selected.
`prefit`	logical indicating whether the regressions have already been fit to the training data for each fold.
`foldid`	optional vector with entries in 1:nfolds giving the fold to which each observation belongs. Overrides nfolds.
`bw`	vector of bandwidths for the Gaussian resmoothing kernel.
`nfolds`	integer number of cross-validation folds.
`n_points`	integer number of gridpoints to be used for convolution.
`sd_trim`	numeric number of standard deviations at which to trim the Gaussian distribution.
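
To make the bw and foldid arguments concrete, the sketch below evaluates the default bandwidth grid from the Usage section by hand and builds a balanced fold assignment. It uses base R only. The balanced random assignment is an assumption made for illustration and need not match how cv_resmooth assigns folds when foldid = NULL.

```r
set.seed(1)
X <- matrix(stats::rnorm(200), ncol = 2)
d <- 2  # covariate to be smoothed along

# Default grid from Usage: log-spaced multiples of sd(X[, d]).
bw <- exp(seq(-5, 2, 0.2)) / (2 * sqrt(3)) * stats::sd(X[, d])
length(bw)  # number of candidate bandwidths
range(bw)   # smallest and largest candidate

# Illustrative fold assignment (assumption): 5 balanced folds in random order.
nfolds <- 5L
foldid <- sample(rep(seq_len(nfolds), length.out = nrow(X)))
table(foldid)
```

A vector built this way could be passed as foldid, in which case the nfolds argument is ignored.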

Value

A list with the following components: "bw", the vector of bandwidths used; "cv", the vector of cross-validation scores, one per bandwidth; "cv_unsm", the numeric cross-validation score without any smoothing; "bw_opt_inds", the vector of indices of the selected bandwidths, one per tolerance; and "bw_opt", the vector of corresponding bandwidths.
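
As a rough illustration of how these components relate, the sketch below picks, for each tolerance, the largest bandwidth whose score stays below a threshold tied to the unsmoothed score. The numbers are toy values, not output from cv_resmooth. The thresholding rule (cv <= cv_unsm + tol * se, with a hypothetical standard error se) and the helper select_bw_ind are assumptions for illustration only; the rule actually applied by the package may differ.

```r
# Toy inputs: illustrative numbers only, not output from cv_resmooth().
bw      <- c(0.1, 0.2, 0.4, 0.8, 1.6)        # candidate bandwidths, increasing
cv      <- c(1.00, 1.01, 1.04, 1.08, 1.80)   # CV score at each bandwidth
cv_unsm <- 1.00                              # CV score without smoothing
se      <- 0.05                              # hypothetical standard error (assumption)
tol     <- c(0.5, 1, 2)

# ASSUMED rule: a bandwidth is admissible if cv <= cv_unsm + tol * se.
# For each tolerance, return the index of the largest admissible bandwidth.
select_bw_ind <- function(t) {
  ok <- which(cv <= cv_unsm + t * se)
  if (length(ok) == 0L) NA_integer_ else max(ok)
}

bw_opt_inds <- vapply(tol, select_bw_ind, integer(1))
bw_opt <- bw[bw_opt_inds]
```

Under this assumed rule, a larger tolerance admits a higher score and therefore a larger bandwidth, which matches the description of tol in the Arguments section.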

Examples

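# Simulate 100 observations with two covariates.
X <- matrix(stats::rnorm(200), ncol=2)
y <- X[,1] + sin(X[,2]) + 0.5 * stats::rnorm(nrow(X))
# Regression method: fits a linear model and returns a prediction function
# in the "fit" component of a list.
reg <- function(X,y){
    df <- data.frame(y,X)
    colnames(df) <- c("y", "X1", "X2")
    lm1 <- stats::lm(y~X1+sin(X2), data=df)
    fit <- function(newX){
        newdf <- data.frame(newX)
        colnames(newdf) <- c("X1", "X2")
        return(as.vector(stats::predict(lm1, newdata=newdf)))
    }
    return(list("fit"=fit))
}
# Select a bandwidth along the second covariate for each of three tolerances.
cv_resmooth(X=X, y=y, d=2, regression=reg, tol = c(0.5, 1, 2))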
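
The prefit = TRUE mode described under Arguments could be used along the following lines. This is a minimal sketch under stated assumptions: the model for fold k is assumed to be trained on all observations outside fold k, and the names foldid and prefit_list are introduced here for illustration. The call to cv_resmooth is guarded so that the block also runs when the drape package is not installed.

```r
set.seed(1)
X <- matrix(stats::rnorm(200), ncol = 2)
y <- X[, 1] + sin(X[, 2]) + 0.5 * stats::rnorm(nrow(X))

# Same regression wrapper as in the example above.
reg <- function(X, y) {
  df <- data.frame(y, X)
  colnames(df) <- c("y", "X1", "X2")
  lm1 <- stats::lm(y ~ X1 + sin(X2), data = df)
  fit <- function(newX) {
    newdf <- data.frame(newX)
    colnames(newdf) <- c("X1", "X2")
    as.vector(stats::predict(lm1, newdata = newdf))
  }
  list(fit = fit)
}

# Fix the folds up front so the prefit models and the folds agree.
nfolds <- 5L
foldid <- sample(rep(seq_len(nfolds), length.out = nrow(X)))

# ASSUMPTION: entry k is fit on the observations NOT in fold k.
prefit_list <- lapply(seq_len(nfolds), function(k) {
  train <- foldid != k
  reg(X[train, , drop = FALSE], y[train])
})

# Only call the package function if drape is available.
if (requireNamespace("drape", quietly = TRUE)) {
  res <- drape::cv_resmooth(X = X, y = y, d = 2, regression = prefit_list,
                            prefit = TRUE, foldid = foldid, tol = c(0.5, 1, 2))
  res$bw_opt  # one selected bandwidth per tolerance
}
```

Passing the same foldid that was used to build prefit_list keeps each prefit model paired with the fold it was not trained on.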

[Package drape version 0.0.1 Index]