cv.ecpc {ecpc} | R Documentation |
Cross-validation for 'ecpc'
Description
Cross-validates 'ecpc' and returns model fit, summary statistics and cross-validated performance measures.
Usage
cv.ecpc(Y,X,type.measure=c("MSE","AUC"),outerfolds=10,
lambdas=NULL,ncores=1,balance=TRUE,silent=FALSE,...)
Arguments
Y |
Response data; n-dimensional vector (n: number of samples) for linear and logistic outcomes, or |
X |
Observed data; (nxp)-dimensional matrix (p: number of covariates) with each row the observed high-dimensional feature vector of a sample. |
type.measure |
Type of cross-validated performance measure returned: "MSE" or "AUC". |
outerfolds |
Number of cross-validation folds. |
lambdas |
Vector of global ridge penalties, one per fold; estimated if not given (default NULL). |
ncores |
Number of cores; if larger than 1, the outer cross-validation folds are processed in parallel over 'ncores' clusters. |
balance |
(logistic, Cox) Should folds be balanced in response? |
silent |
Should output messages be suppressed (default FALSE)? |
... |
Additional arguments used in |
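To illustrate what balancing folds in the response can mean for the 'balance' argument, here is a minimal base R sketch. It assumes a simple stratified assignment; the helper make_balanced_folds is hypothetical and is not part of 'ecpc', and the package may construct its folds differently.

```r
# Hypothetical helper (not part of ecpc): assign samples to folds so that
# each response class is spread as evenly as possible over the folds.
make_balanced_folds <- function(y, nfolds) {
  folds <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    # shuffle the class members; guard against sample()'s scalar behaviour
    if (length(idx) > 1) idx <- sample(idx)
    # deal the shuffled members out to the folds in turn
    folds[idx] <- rep_len(seq_len(nfolds), length(idx))
  }
  folds
}

set.seed(1)
y <- rbinom(100, 1, 0.3)                      # toy binary response
folds <- make_balanced_folds(y, nfolds = 10)
table(folds, y)                               # per class, fold counts differ by at most 1
```

Within each class the fold counts differ by at most one, so every fold sees roughly the same class proportions as the full data set.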
Value
A list with the following elements:
ecpc.fit |
List with the ecpc model fit in each fold. |
dfPred |
Data frame with information about out-of-bag predictions. |
dfGrps |
Data frame with information about estimated group and group set weights across folds. |
dfCVM |
Data frame with cross-validated performance metric. |
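As a rough illustration of how a cross-validated performance measure such as the MSE can be derived from out-of-bag predictions, the sketch below uses a toy data frame. The column names fold, truth and pred are assumptions made for this example only; they are not the documented layout of dfPred or dfCVM.

```r
# Toy stand-in for out-of-bag predictions; column names are hypothetical.
set.seed(2)
toy <- data.frame(
  fold  = rep(1:2, each = 5),   # fold in which each sample was left out
  truth = rnorm(10)             # observed response
)
toy$pred <- toy$truth + rnorm(10, sd = 0.5)   # noisy out-of-bag prediction

# Mean squared error over all out-of-bag predictions
mse_all <- mean((toy$pred - toy$truth)^2)
# Mean squared error within each fold
mse_fold <- tapply((toy$pred - toy$truth)^2, toy$fold, mean)
mse_all
mse_fold
```

With equally sized folds, the average of the per-fold errors equals the overall error; with unequal folds the two can differ.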
See Also
Visualise cross-validated group set weights with visualiseGroupsetweights or group weights with visualiseGroupweights.
Examples
#####################
# Simulate toy data #
#####################
set.seed(1) #fix the random seed so the simulation and folds are reproducible
p<-300 #number of covariates
n<-100 #sample size training data set
n2<-100 #sample size test data set
#simulate all betas i.i.d. from beta_k~N(mean=0,sd=sqrt(0.1)):
muBeta<-0 #prior mean
varBeta<-0.1 #prior variance
indT1<-rep(1,p) #vector with group numbers all 1 (all simulated from same normal distribution)
#simulate test and training data sets:
Dat<-simDat(n,p,n2,muBeta,varBeta,indT1,sigma=1,model='linear')
str(Dat) #Dat contains centered observed data, response data and regression coefficients
##########################
# Make co-data group sets #
##########################
#Group set: G random groups
G <- 5 #number of groups
#sample random categorical co-data:
categoricalRandom <- as.factor(sample(1:G,p,TRUE))
#make group set, i.e. list with G groups:
groupsetRandom <- createGroupset(categoricalRandom)
#######################
# Cross-validate ecpc #
#######################
tic<-proc.time()[[3]]
cv.fit <- cv.ecpc(type.measure="MSE",outerfolds=2,
Y=Dat$Y,X=Dat$Xctd,
groupsets=list(groupsetRandom),
groupsets.grouplvl=list(NULL),
hypershrinkage=c("none"),
model="linear",maxsel=c(5,10,15,20))
toc <- proc.time()[[3]]-tic #elapsed time in seconds
str(cv.fit$ecpc.fit) #list containing the model fits on the folds
str(cv.fit$dfPred) #data frame containing information on the predictions
cv.fit$dfCVM #data frame with the cross-validated performance for ecpc
#with/without posterior selection and ordinary ridge