cvRepWtTuning {ClinicalUtilityRecal}    R Documentation

Repeated Cross Validation for Weight Tuning Parameter Selection


Calibration weights require specification of a tuning parameter, delta or lambda. Because a single round of cross-validation can be noisy, cross-validation can be repeated multiple times with independent random partitions and the results averaged. This function implements repeated K-fold cross-validation in which the tuning parameter lambda or delta is selected by maximizing standardized net benefit (sNB), i.e. a repeated cvWtTuning procedure.
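The repeated partitioning described above can be sketched as follows. This is a minimal illustration, not the package's internal code: the helper name make_rep_folds and its arguments are hypothetical, and it only shows how independent random fold assignments could be drawn for each repetition.

```r
# Hypothetical helper (not part of ClinicalUtilityRecal): assign each of n
# observations to one of kFold folds, independently for each repetition.
make_rep_folds <- function(n, kFold, cvRep, seed = 1) {
  set.seed(seed)
  # One column per repetition; each column is a random shuffle of
  # balanced fold labels 1..kFold.
  sapply(seq_len(cvRep), function(i) {
    sample(rep(seq_len(kFold), length.out = n))
  })
}

folds <- make_rep_folds(n = 103, kFold = 5, cvRep = 25)
dim(folds)         # 103 observations by 25 repetitions
table(folds[, 1])  # fold sizes within one repetition differ by at most one
```

Each column can then drive one round of K-fold cross-validation, and the per-repetition sNB estimates can be averaged across columns.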

A "one-standard error" rule can be used for selecting tuning parameters. Under the "one-standard error" rule, the calibration weight tuning parameter (lambda or delta) is selected such that the corresponding cross-validated sNB is within one standard deviation of the maximum cross-validated sNB. This protects against overfitting the data and selecting a tuning parameter that is too extreme. If the "one-standard error" rule is not implemented, the tuning parameter with the largest average cross-validated sNB (across folds and repetitions) is selected.
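A minimal sketch of the two selection rules is given here. The helper select_tuning and the toy numbers are hypothetical. In particular, the sketch assumes that the smallest qualifying value is the "least extreme" choice; the package may break ties among qualifying values differently.

```r
# Hypothetical helper (not part of ClinicalUtilityRecal): choose a tuning
# parameter from averaged cross-validated sNB values.
select_tuning <- function(tuneSeq, avg_sNB, sd_sNB = NULL, stdErrRule = FALSE) {
  best <- which.max(avg_sNB)
  if (!stdErrRule) {
    # Plain rule: the value with the largest average sNB
    return(tuneSeq[best])
  }
  # Values whose average sNB is within one SD of the maximum
  ok <- avg_sNB >= avg_sNB[best] - sd_sNB[best]
  # ASSUMPTION: the smallest qualifying value is the least extreme one
  min(tuneSeq[ok])
}

# Toy inputs, made up for illustration only
tuneSeq <- c(0.5, 1, 2, 4, 8)
avg_sNB <- c(0.20, 0.29, 0.30, 0.31, 0.25)
sd_sNB  <- rep(0.03, 5)

sel_max <- select_tuning(tuneSeq, avg_sNB)                             # 4
sel_1se <- select_tuning(tuneSeq, avg_sNB, sd_sNB, stdErrRule = TRUE)  # 1
```

With these toy values, the maximum sNB occurs at 4, while the values 1, 2 and 4 all fall within one standard deviation of it, so the one-standard-error rule returns 1.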





Vector of binary outcomes, with 1 indicating event (cases) and 0 indicating no event (controls)


Vector of risk score values


Clinically relevant risk threshold


Lower bound of clinically relevant region


Upper bound of clinically relevant region


Number of folds for cross-validation


Number of cross-validation repetitions


Parameter to be selected via cross-validation. Can be either delta, the weight assigned to observations outside the clinically relevant region [R_l, R_u], or lambda, the tuning parameter controlling exponential decay within the clinically relevant region [R_l, R_u]


Sequence of values of tuning parameters to perform cross-validation over


Use the "one-standard error" rule when selecting the tuning parameter


Initial seed set for random splitting of data into K folds


To estimate the standard deviation of the cross-validated sNB, the dependence between the different partitions of cross-validation needs to be accounted for. Gelman and Rubin (1992) give a variance estimator for a convergence diagnostic statistic used when Markov chain Monte Carlo is run with multiple chains. The variance estimator accounts for both the variability of the statistic "within" a single chain and the variance of the statistic across, or "between", chains. Analogously, we can use this framework to estimate the "within" repetition variance (i.e. the variation in sNB from a single round of K-fold cross-validation) and the "between" repetition variance. We denote the "within" repetition variance as W and the "between" repetition variance as B. We modify the formula given in Gelman and Rubin (1992) slightly to account for the fact that as the number of cross-validation repetitions increases, the between-repetition variability should decrease. See Mishra et al (2020) for full expressions of B and W.
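The within/between decomposition can be illustrated with a short sketch. This shows only the basic idea (average variance across folds within a repetition, and variance of the repetition means); it is not the modified estimator used by the package, whose exact expressions are given in the cited work. The function within_between and the simulated sNB values are hypothetical.

```r
# Hypothetical helper: sNB_mat has one row per repetition and one column per
# fold, holding cross-validated sNB for a single tuning parameter value.
within_between <- function(sNB_mat) {
  rep_means <- rowMeans(sNB_mat)
  # "Within": average over repetitions of the across-fold variance
  W <- mean(apply(sNB_mat, 1, var))
  # "Between": variance of the repetition-level means
  # (unscaled; the package's estimator adjusts this term)
  B <- var(rep_means)
  list(W = W, B = B)
}

# Simulated values, for illustration only: 25 repetitions of 5-fold CV
set.seed(42)
sNB_mat <- matrix(rnorm(25 * 5, mean = 0.3, sd = 0.05), nrow = 25, ncol = 5)
vb <- within_between(sNB_mat)
vb$W
vb$B
```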



Standardized net benefit (sNB) of tuning parameter selected via cross-validation


Corresponding RAW value given the tuning parameter selected via cross-validation


lambda value selected via cross-validation if cvParm=lambda, otherwise user-specified lambda value

delta value selected via cross-validation if cvParm=delta, otherwise user-specified delta value


Averaged (across repetitions) cross-validated sNB for sequence of tuning parameters


Estimate of "within" repetition variance. Only returned if stdErrRule==TRUE


Estimate of "between" repetition variance. Only returned if stdErrRule==TRUE


List of cross-validation results for all folds and repetitions


Anu Mishra


Mishra, A. (2019). Methods for Risk Markers that Incorporate Clinical Utility (Doctoral dissertation). (Available Upon Request)

Friedman, J., Hastie, T., & Tibshirani, R. (2001). The elements of statistical learning (Vol. 1, No. 10). New York: Springer series in statistics.

Gelman, A., & Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical science, 7(4), 457-472.

See Also

calWt, RAWgrid, nb, cvWtTuning


### Load data ##
## Not run: 

### Get grid of tuning parameters  ###
grid <- RAWgrid(r = 0.3, rl = -Inf, ru = Inf, p = fakeData$p, y = fakeData$y,
                cvParm = "lambda", rl.raw = 0.25, ru.raw = 0.35)

### Implement repeated k-fold cross validation
repCV <- cvRepWtTuning(y = fakeData$y, p = fakeData$p, rl = -Inf, ru = Inf, r = 0.3,
                       kFold = 5, cvRep = 25, cvParm = "lambda", tuneSeq = grid,
                       stdErrRule = TRUE)

## cross-validation results

## cross-validation selected lambda, RAW, and sNB
cv.lambda <- repCV$cv.lambda
cv.RAW <- repCV$cv.RAW
cv.sNB <- repCV$cv.sNB

## End(Not run)

[Package ClinicalUtilityRecal version 0.1.0 Index]