multiridge-package {multiridge}

R Documentation

Fast cross-validation for multi-penalty ridge regression

Description

The package implements multi-penalty linear, logistic and Cox ridge regression, including estimation of the penalty parameters by efficient (repeated) cross-validation or marginal likelihood maximization. It supports multiple high-dimensional data types that require penalization, as well as unpenalized variables. It also allows a paired penalty for paired data types, and preferential data types can be specified.

Details

The DESCRIPTION file:

Package:	multiridge
Type:	Package
Title:	Fast Cross-Validation for Multi-Penalty Ridge Regression
Version:	1.11
Date:	2022-06-13
Author:	Mark A. van de Wiel
Maintainer:	Mark A. van de Wiel <mark.vdwiel@amsterdamumc.nl>
Depends:	R (>= 3.5.0), survival, pROC, methods, mgcv, snowfall
Description:	Multi-penalty linear, logistic and Cox ridge regression, including estimation of the penalty parameters by efficient (repeated) cross-validation and marginal likelihood maximization. Multiple high-dimensional data types that require penalization are allowed, as well as unpenalized variables. Paired and preferential data types can be specified. See Van de Wiel et al. (2021), <arXiv:2005.09301>.
License:	GPL (>=3)

Index of help topics:

CVfolds                 Creates (repeated) cross-validation folds
CVscore                 Cross-validated score
IWLSCoxridge            Iterative weighted least squares algorithm for
                        Cox ridge regression.
IWLSridge               Iterative weighted least squares algorithm for
                        linear and logistic ridge regression.
Scoring                 Evaluate predictions
SigmaFromBlocks         Create penalized sample cross-product matrix
augment                 Augment data with zeros.
betasout                Coefficient estimates from (converged) IWLS fit
createXXblocks          Creates list of (unscaled) sample covariance
                        matrices
createXblocks           Create list of paired data blocks
dataXXmirmeth           Contains R-object 'dataXXmirmeth'
doubleCV                Double cross-validation for estimating
                        performance of 'multiridge'
fastCV2                 Fast cross-validation per data block
mgcv_lambda             Maximum marginal likelihood score
mlikCV                  Outer-loop cross-validation for estimating
                        performance of marginal likelihood based
                        'multiridge'
multiridge-package      Fast cross-validation for multi-penalty ridge
                        regression
optLambdas              Find optimal ridge penalties.
optLambdasWrap          Find optimal ridge penalties with sequential
                        optimization.
optLambdas_mgcv         Find optimal ridge penalties with maximum
                        marginal likelihood
optLambdas_mgcvWrap     Find optimal ridge penalties with sequential
                        optimization.
predictIWLS             Predictions from ridge fits
setupParallel           Setting up parallel computing

betasout: Coefficient estimates from (converged) IWLS fit
createXXblocks: Creates list of (unscaled) sample covariance matrices
CVscore: Cross-validated score for given penalty parameters
dataXXmirmeth: Example data
doubleCV: Double cross-validation for estimating performance
fastCV2: Fast cross-validation per data block; no dependency
IWLSCoxridge: Iterative weighted least squares algorithm for Cox ridge regression
IWLSridge: Iterative weighted least squares algorithm for linear and logistic ridge regression
mlikCV: Cross-validation for estimating performance of marginal likelihood estimation
optLambdasWrap: Find optimal ridge penalties by cross-validation
optLambdas_mgcvWrap: Find optimal ridge penalties in terms of marginal likelihood
predictIWLS: Predictions from ridge fits
setupParallel: Setting up parallel computing
SigmaFromBlocks: Create penalized sample cross-product matrix
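
The roles of createXXblocks and SigmaFromBlocks in the list above can be illustrated with a minimal base-R sketch for the linear case. It is not the package's code. It assumes the criterion ||Y - X beta||^2 + sum_b lambda_b ||beta_b||^2 and assumes that the penalized cross-product matrix is the sum of the per-block n x n cross-products, each divided by its penalty. The simulated data, the penalty values and all variable names are hypothetical.

```r
# Base-R sketch (not multiridge code): linear multi-penalty ridge
# computed from n x n matrices instead of p-dimensional ones.
set.seed(1)
n <- 20; p1 <- 50; p2 <- 80
X1 <- matrix(rnorm(n * p1), n, p1)   # simulated data block 1
X2 <- matrix(rnorm(n * p2), n, p2)   # simulated data block 2
Y  <- rnorm(n)
lambdas <- c(5, 20)                  # hypothetical penalties, one per block

# Per-block n x n cross-products X_b %*% t(X_b);
# computed once and reusable for any penalty values.
XXblocks <- list(tcrossprod(X1), tcrossprod(X2))

# Assumed form of the penalized cross-product matrix: sum_b XX_b / lambda_b
Sigma <- Reduce(`+`, Map(function(XX, lam) XX / lam, XXblocks, lambdas))

# Linear predictors from an n x n solve: eta = Sigma (Sigma + I)^{-1} Y
eta_fast <- drop(Sigma %*% solve(Sigma + diag(n), Y))

# Reference: the direct (p1 + p2)-dimensional ridge solution
X <- cbind(X1, X2)
Lambda <- diag(rep(lambdas, c(p1, p2)))
beta <- solve(crossprod(X) + Lambda, crossprod(X, Y))
eta_direct <- drop(X %*% beta)

# Both routes should agree up to numerical error
max_abs_diff <- max(abs(eta_fast - eta_direct))
```

Under these assumptions, changing the penalties only requires re-weighting and re-summing the stored n x n blocks, which is what makes repeated evaluation over many penalty values cheap when p is much larger than n.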

Author(s)

Mark A. van de Wiel (mark.vdwiel@amsterdamumc.nl)

References

Mark A. van de Wiel, Mirrelijn van Nee, Armin Rauschenberger (2021). Fast cross-validation for high-dimensional ridge regression. Journal of Computational and Graphical Statistics.

Examples

data(dataXXmirmeth)
resp <- dataXXmirmeth[[1]]
XXmirmeth <- dataXXmirmeth[[2]]

# Find initial lambdas: fast CV per data block separately.
cvperblock2 <- fastCV2(XXblocks=XXmirmeth,Y=resp,kfold=10,fixedfolds = TRUE)
lambdas <- cvperblock2$lambdas

# Create (repeated) CV-splits of the data.
leftout <- CVfolds(Y=resp,kfold=10,nrepeat=3,fixedfolds = TRUE)

# Compute the cross-validated score for the initial lambdas
CVscore(penalties=lambdas, XXblocks=XXmirmeth, Y=resp, folds=leftout,
        score="loglik")

# Optimize the cross-validated criterion (default: log-likelihood)
# Increase the number of iterations for optimal results
jointlambdas <- optLambdasWrap(penaltiesinit=lambdas, XXblocks=XXmirmeth, Y=resp,
                               folds=leftout, score="loglik", save=TRUE,
                               maxItropt1=5, maxItropt2=5)


# Alternatively: optimize by using marginal likelihood criterion
## Not run: 
jointlambdas2 <- optLambdas_mgcvWrap(penaltiesinit=lambdas, XXblocks=XXmirmeth,
Y=resp)

## End(Not run)

# Optimal lambdas
optlambdas <- jointlambdas$optpen

# Prepare fitting for the optimal lambdas.
XXT <- SigmaFromBlocks(XXmirmeth,penalties=optlambdas)

# Fit. fit$etas contains the n linear predictors
fit <- IWLSridge(XXT,Y=resp)
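
The cross-validation steps in the example above work on n x n blocks. As a rough illustration of why that is cheap, the base-R sketch below computes a K-fold mean squared error for a linear multi-penalty ridge using only sub-blocks of one n x n matrix. It is not the package's implementation: the helper cv_mse, the simulated data, the fold construction and the assumed form sum_b XX_b / lambda_b are all hypothetical choices made for this sketch.

```r
# Base-R sketch (not multiridge code): K-fold CV for linear
# multi-penalty ridge from sub-blocks of an n x n matrix.
set.seed(2)
n <- 30
X1 <- matrix(rnorm(n * 40), n, 40)   # simulated data block 1
X2 <- matrix(rnorm(n * 60), n, 60)   # simulated data block 2
Y  <- rnorm(n)
XXblocks <- list(tcrossprod(X1), tcrossprod(X2))

# Hypothetical helper: cross-validated mean squared error for given penalties.
# 'folds' is a list of index vectors, each holding the left-out samples.
cv_mse <- function(lambdas, XXblocks, Y, folds) {
  # Assumed penalized cross-product matrix: sum_b XX_b / lambda_b
  Sigma <- Reduce(`+`, Map(function(XX, lam) XX / lam, XXblocks, lambdas))
  sqerr <- unlist(lapply(folds, function(out) {
    inn   <- setdiff(seq_along(Y), out)
    S_in  <- Sigma[inn, inn, drop = FALSE]   # training x training
    S_out <- Sigma[out, inn, drop = FALSE]   # left-out x training
    # Left-out predictions: S_out (S_in + I)^{-1} Y_in
    pred <- S_out %*% solve(S_in + diag(length(inn)), Y[inn])
    (Y[out] - drop(pred))^2
  }))
  mean(sqerr)
}

# Five random folds; each element lists the left-out sample indices
folds <- split(sample(n), rep(1:5, length.out = n))

score_a <- cv_mse(c(1, 1), XXblocks, Y, folds)       # weak penalties
score_b <- cv_mse(c(100, 100), XXblocks, Y, folds)   # strong penalties
```

No p-dimensional coefficient vector is formed inside the loop, so the cost per fold depends on n rather than on the number of features.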