crispCV {crisp} | R Documentation |
CRISP with Tuning Parameter Selection via Cross-Validation.
Description
This function implements CRISP, which considers the problem of predicting an outcome variable on the basis of two covariates, using an interpretable yet non-additive model.
CRISP partitions the covariate space into blocks in a data-adaptive way, and fits a mean model within each block. Unlike other partitioning methods, CRISP is fit using a non-greedy approach by solving a convex optimization problem, resulting in low-variance fits. This function differs from the crisp function in that the tuning parameter, lambda, is automatically selected using K-fold cross-validation.
More details are provided in Petersen, A., Simon, N., and Witten, D. (2016). Convex Regression with Interpretable Sharp Partitions. Journal of Machine Learning Research, 17(94): 1-31 <http://jmlr.org/papers/volume17/15-344/15-344.pdf>.
Usage
crispCV(y, X, q = NULL, lambda.min.ratio = 0.01, n.lambda = 50,
lambda.seq = NULL, fold = NULL, n.fold = NULL, seed = NULL,
within1SE = FALSE, rho = 0.1, e_abs = 10^-4, e_rel = 10^-3,
varyrho = TRUE, double.run = FALSE)
Arguments
y |
An n-vector containing the response. |
X |
An n x 2 matrix with each column containing a covariate. |
q |
The desired granularity of the CRISP fit, |
lambda.min.ratio |
The smallest value for |
n.lambda |
The number of lambda values to consider - the default is 50. |
lambda.seq |
A user-supplied sequence of positive lambda values to consider. The typical usage is to calculate
|
fold |
User-supplied fold numbers for cross-validation. If supplied, |
n.fold |
The number of folds, K, to use for the K-fold cross-validation selection of the tuning parameter, lambda. The default is 10 - specification of |
seed |
An optional number used with |
within1SE |
Logical value indicating how cross-validated tuning parameters should be chosen. If |
rho |
The penalty parameter for our ADMM algorithm. The default is 0.1. |
e_abs, e_rel |
Values used in the stopping criterion for our ADMM algorithm, and discussed in Appendix C.2 of the CRISP paper. |
varyrho |
Should |
double.run |
The initial complete run of our ADMM algorithm will yield sparsity in z_1i and z_2i, but not
necessarily exact equality of the rows and columns of |
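As a rough illustration of the fold and seed arguments, the sketch below builds a balanced, reproducible vector of fold numbers in base R. The helper make_folds is hypothetical and not part of the crisp package, and it assumes that fold expects one fold number per observation; the exact format crispCV requires is not stated here.

```r
# Hypothetical helper (not part of the crisp package): assign each of n
# observations to one of n.fold folds, with fold sizes differing by at most one.
make_folds <- function(n, n.fold = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # make the random assignment reproducible
  # Repeat the labels 1..n.fold until there are n of them, then shuffle.
  sample(rep(seq_len(n.fold), length.out = n))
}

fold <- make_folds(n = 15, n.fold = 3, seed = 1)
table(fold)  # each of the 3 folds holds 5 observations
```

Under that assumption, a vector like this could be passed as fold so that the same partition is reused across several fits.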
Value
An object of class crispCV, which can be summarized using summary, plotted using plot, and used to predict outcome values for new covariates using predict.
lambda.cv: Optimal lambda value chosen by K-fold cross-validation.
index.cv: The index of the model corresponding to the chosen tuning parameter, lambda.cv. That is, lambda.cv = crisp.out$lambda.seq[index.cv].
crisp.out: An object of class crisp returned by crisp.
mean.cv.error: An m-vector containing cross-validation error, where m is the length of lambda.seq. Note that mean.cv.error[i] contains the cross-validation error for the tuning parameter crisp.out$lambda.seq[i].
se.cv.error: An m-vector containing cross-validation standard error, where m is the length of lambda.seq. Note that se.cv.error[i] contains the standard error of the cross-validation error for the tuning parameter crisp.out$lambda.seq[i].
Other elements: As specified by the user.
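To show how mean.cv.error and se.cv.error relate to a chosen index, the sketch below applies two common selection rules to made-up numbers in base R: the minimum-error rule and the one-standard-error rule. The values are illustrative only, and the code assumes that a larger lambda corresponds to a simpler fit. Whether crispCV applies exactly this rule when within1SE = TRUE is not confirmed by the text above.

```r
# Illustrative values only; in practice these would come from a crispCV object
# (crisp.out$lambda.seq, mean.cv.error, se.cv.error).
lambda.seq    <- c(10, 5, 2.5, 1.25, 0.625)
mean.cv.error <- c(4.0, 2.6, 2.1, 2.0, 2.4)
se.cv.error   <- c(0.5, 0.4, 0.3, 0.3, 0.4)

# Rule 1: the lambda with the smallest cross-validation error.
index.min <- which.min(mean.cv.error)

# Rule 2 (one-standard-error rule): the largest lambda whose error is
# within one standard error of the minimum error.
threshold <- mean.cv.error[index.min] + se.cv.error[index.min]
eligible  <- which(mean.cv.error <= threshold)
index.1se <- eligible[which.max(lambda.seq[eligible])]

lambda.seq[index.min]  # 1.25
lambda.seq[index.1se]  # 2.5
```

Picking the largest eligible lambda, rather than the first or last index, keeps the sketch independent of whether lambda.seq is stored in increasing or decreasing order.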
See Also
crisp, plot, summary, predict, plot.cvError
Examples
## Not run:
#See ?'crisp-package' for a full example of how to use this package
#generate data (using a very small 'n' for illustration purposes)
set.seed(1)
data <- sim.data(n = 15, scenario = 2)
#fit model and select lambda using 2-fold cross-validation
#note: use larger 'n.fold' (e.g., 10) in practice
crispCV.out <- crispCV(X = data$X, y = data$y, n.fold = 2)
## End(Not run)