cross_val {CARRoT}  R Documentation 
Cross-validation run
Description
Runs a single cross-validation by partitioning the data into a training set and a test set.
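The idea of a single random split can be sketched as follows. This is a minimal illustration, not the package's own code: how cross_val actually draws the split is not stated on this page, and the names idx, train_id and test_id are hypothetical. The sizes l = 100 and n_tr = 90 are taken from the Examples section.

```r
# Hypothetical sketch of a single random train/test split; not CARRoT's code.
set.seed(1)                      # fixed seed so the illustration is reproducible
l    <- 100                      # number of observations
n_tr <- 90                       # size of the training set
idx      <- sample(l)            # random permutation of 1..l
train_id <- idx[seq_len(n_tr)]   # first n_tr indices form the training set
test_id  <- idx[-seq_len(n_tr)]  # the remaining indices form the test set
```

Every observation ends up in exactly one of the two sets, so the test set here has l - n_tr = 10 elements.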
Usage
cross_val(
vari,
outi,
c,
rule,
part,
l,
we,
vari_col,
preds,
mode,
cmode,
predm,
cutoff,
objfun,
minx = 1,
maxx = NULL,
nr = NULL,
maxw = NULL,
st = NULL,
corr = 1,
Rsq = F,
marg = 0,
n_tr,
preds_tr
)
Arguments
vari 
set of predictors 
outi 
array of outcomes 
c 
set of all indices of the predictors 
rule 
an Events per Variable (EPV) rule, defaults to 10 
part 
indicates partition of the original dataset into training and test set in a proportion 
l 
number of observations 
we 
weights of the predictors 
vari_col 
overall number of predictors 
preds 
array to write predictions for the test split into, initially empty 
mode 

cmode 

predm 

cutoff 
cutoff value for logistic regression 
objfun 

minx 
minimum number of predictors to be included in a regression, defaults to 1 
maxx 
maximum number of predictors to be included in a regression, defaults to maximum feasible number according to one in ten rule 
nr 
a subset of the dataset, such that 
maxw 
maximum weight of predictors to be included in a regression, defaults to maximum weight according to one in ten rule 
st 
a subset of predictors to be always included in a predictive model, defaults to the empty set 
corr 
maximum correlation between a pair of predictors in a model 
Rsq 
whether a constraint on the R-squared statistic is introduced 
marg 
margin of error for the R-squared statistic constraint 
n_tr 
size of the training set 
preds_tr 
array to write predictions for the training split into, initially empty 
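The effect of an EPV rule on the maximum model size (see rule, maxx and maxw above) can be illustrated with a short sketch. It assumes a binary outcome and that the limiting count is the less frequent outcome class in the training set; the helper max_predictors_epv is hypothetical and is not a function of the package.

```r
# Hypothetical helper (not part of CARRoT): largest model size allowed by an
# EPV rule for a binary outcome, assuming the limiting count is the less
# frequent outcome class in the training set.
max_predictors_epv <- function(outcome, rule = 10) {
  events <- min(table(factor(outcome, levels = c(0, 1))))
  floor(events / rule)
}

# 27 events among 90 training observations
outcome <- c(rep(1, 27), rep(0, 63))
max_predictors_epv(outcome)   # 27 events / 10 per variable, rounded down
```

With rule = 10 this corresponds to the one in ten rule mentioned for maxx and maxw; when predictors carry unequal weights, the analogous bound would apply to the total weight rather than to the count.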
Value
regr 
An M x N matrix of sums of the absolute errors for each element of the test set for each feasible regression. M is the maximum feasible number of variables included in a regression, and N is the maximum feasible number of regressions of a fixed size. The row index indicates the number of variables included in a regression, so each row holds the results of regressions with the same number of variables, and the columns correspond to the different subsets of predictors used. 
regrr 
An M x N matrix of sums of the relative errors for each element of the test set (only for 
nvar 
Maximum feasible number of variables in the regression 
emp 
An accuracy of always predicting the more likely outcome as suggested by the training set (only for 
In regr and regrr, NA values are possible since for some numbers of variables there are fewer feasible regressions than for others.
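Why such a matrix contains NA values can be seen from a small sketch of its layout. This is an illustration only: it assumes 4 candidate predictors and at most 3 per regression, ignores weights, the correlation limit and any other constraint, and stores a placeholder 0 where a real run would store an error sum. The names p, n_subsets and res are hypothetical.

```r
# Illustration only: layout of an M x N result matrix when the number of
# predictor subsets differs by subset size.
p    <- 4                               # number of candidate predictors (assumed)
nvar <- 3                               # maximum predictors per regression (assumed)
n_subsets <- choose(p, seq_len(nvar))   # number of subsets of size 1, 2 and 3
res <- matrix(NA_real_, nrow = nvar, ncol = max(n_subsets))
for (k in seq_len(nvar)) {
  subsets <- combn(p, k)                # each column is one subset of size k
  # placeholder value per regression; a real run would store an error sum
  res[k, seq_len(ncol(subsets))] <- 0
}
res
```

Rows with fewer subsets than the widest row keep NA in their trailing columns, which matches the remark about NA values above.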
See Also
Uses compute_max_weight, sum_weights_sub, make_numeric_sets, get_predictions_lin, get_predictions, get_probabilities, AUC, combn
Examples
#creating variables
vari<-matrix(c(1:100,seq(1,300,3)),ncol=2)
#creating outcomes
out<-rbinom(100,1,0.3)
#creating arrays for predictions
pr<-array(NA,c(2,2))
pr_tr<-array(NA,c(2,2))
#passing the set of indices of the predictors
c<-c(1:2)
#passing the weights of the predictors
we<-c(1,1)
#setting the mode
m<-'binary'
#running the function
cross_val(vari,out,c,10,10,100,we,2,pr,m,'det','exact',0.5,'acc',nr=c(1,4),n_tr=90,preds_tr=pr_tr)