regr_ind {CARRoT}  R Documentation 
One of the two main functions of the package. Identifies the predictors included in the regressions with the highest average predictive power.
regr_ind( vari, outi, crv, cutoff = NULL, part = 10, mode, cmode = "det", predm = "exact", objfun = "acc", parallel = FALSE, cores, minx = 1, maxx = NULL, nr = NULL, maxw = NULL, st = NULL, rule = 10, corr = 1 )
vari 
set of predictors 
outi 
array of outcomes 
crv 
number of cross-validations 
cutoff 
cutoff value for mode 
part 
for each cross-validation, partitions the dataset into training and test sets in a proportion 
mode 

cmode 

predm 

objfun 

parallel 
TRUE if using parallel toolbox, FALSE if not. Defaults to FALSE 
cores 
number of cores to use in case of parallel=TRUE 
minx 
minimum number of predictors to be included in a regression, defaults to 1 
maxx 
maximum number of predictors to be included in a regression, defaults to the maximum feasible number according to the one in ten rule 
nr 
a subset of the dataset, such that 
maxw 
maximum weight of predictors to be included in a regression, defaults to the maximum weight according to the one in ten rule 
st 
a subset of predictors to be always included in a predictive model, defaults to the empty set 
rule 
an Events per Variable (EPV) rule, defaults to 10 
corr 
maximum correlation between a pair of predictors in a model 
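The rule and corr arguments can be illustrated in base R. The following is a minimal sketch, assuming the usual reading of an EPV rule for a binary outcome (at most one predictor per rule events of the rarer outcome class) and a plain pairwise Pearson correlation check on absolute values. The helper names epv_max_predictors and pair_allowed are hypothetical and are not part of CARRoT; the package's own computation may differ.

```r
# Hypothetical helper (not part of CARRoT): largest number of predictors
# allowed by an events-per-variable rule for a binary outcome.
# Assumption: "events" means the size of the rarer outcome class.
epv_max_predictors <- function(outcome, rule = 10) {
  events <- min(table(outcome))
  floor(events / rule)
}

# Hypothetical helper (not part of CARRoT): TRUE if the absolute Pearson
# correlation of two predictors does not exceed the cap `corr`.
pair_allowed <- function(x, y, corr = 1) {
  abs(cor(x, y)) <= corr
}

# 30 events and 70 non-events: the rarer class has 30 observations
out <- c(rep(1, 30), rep(0, 70))
max_pred <- epv_max_predictors(out, rule = 10)   # 30 / 10 = 3

# x2 is an exact linear function of x1, so the pair fails a 0.9 cap
x1 <- 1:100
x2 <- seq(1, 300, 3)
ok <- pair_allowed(x1, x2, corr = 0.9)           # FALSE
```

With the default corr = 1 no pair would be excluded by such a check, which is consistent with the default imposing no restriction on correlation.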
Prints the best predictive power provided by a regression and the predictive accuracy of the empirical prediction (value of emp computed by cross_val for logistic and linear regression). Returns the indices of the predictors included in the regressions with the highest predictive power, written in a list. For mode='linear', outputs a list of two lists: the first corresponds to the smallest absolute error, the second to the smallest relative error.
Uses compute_weights, make_numeric, compute_max_weight, compute_max_length, cross_val, av_out, get_indices
#creating variables for linear regression mode
variables_lin<-matrix(c(rnorm(56,0,1),rnorm(56,1,2)),ncol=2)

#creating outcomes for linear regression mode
outcomes_lin<-rnorm(56,2,1)

#running the function
regr_ind(variables_lin,outcomes_lin,100,mode='linear',parallel=TRUE,cores=2)

#creating variables for binary mode
vari<-matrix(c(1:100,seq(1,300,3)),ncol=2)

#creating outcomes for binary mode
out<-rbinom(100,1,0.3)

#running the function
regr_ind(vari,out,20,cutoff=0.5,part=10,mode='binary',parallel=TRUE,cores=2,nr=c(1,10,20),maxx=1)
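
The general idea behind the linear example above, choosing the set of predictors with the best average predictive power over repeated cross-validations, can be sketched in base R without the package. This is an illustration only, not the implementation used by regr_ind: the function cv_abs_error is hypothetical, the test set is assumed to hold 1/part of the rows, and only the mean absolute error is computed.

```r
set.seed(1)

# Data shaped like the linear regression example
n <- 56
x <- matrix(c(rnorm(n, 0, 1), rnorm(n, 1, 2)), ncol = 2)
y <- rnorm(n, 2, 1)

# Hypothetical helper (not part of CARRoT): mean absolute test error of a
# linear model on the columns `cols`, averaged over `crv` random splits.
# Assumption: each split holds out 1/part of the rows as the test set.
cv_abs_error <- function(cols, x, y, crv = 20, part = 10) {
  xs <- x[, cols, drop = FALSE]
  colnames(xs) <- paste0("v", cols)
  df <- data.frame(y = y, xs)
  n_test <- max(1, floor(nrow(df) / part))
  errs <- numeric(crv)
  for (i in seq_len(crv)) {
    test <- sample(nrow(df), n_test)
    fit <- lm(y ~ ., data = df[-test, , drop = FALSE])
    pred <- predict(fit, newdata = df[test, , drop = FALSE])
    errs[i] <- mean(abs(pred - df$y[test]))
  }
  mean(errs)
}

# Every non-empty subset of the two predictors
subsets <- list(1, 2, c(1, 2))
scores <- sapply(subsets, cv_abs_error, x = x, y = y)

# Indices of the predictors with the smallest average absolute error
best <- subsets[[which.min(scores)]]
```

An exhaustive search like this grows quickly with the number of predictors, which is one reason to bound the model size with arguments such as minx, maxx and rule.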