CSMES.ensNomCurve {CSMES} | R Documentation |
CSMES Training Stage 2: Extract an ensemble nomination curve (cost curve- or Brier curve-based) from a set of Pareto-optimal ensemble classifiers
Description
Generates an ensemble nomination curve from a set of Pareto-optimal ensemble definitions as identified through CSMES.ensSel.
Usage
CSMES.ensNomCurve(
ensSelModel,
memberPreds,
y,
curveType = c("costCurve", "brierSkew", "brierCost"),
method = c("classPreds", "probPreds"),
plotting = FALSE,
nrBootstraps = 1
)
Arguments
ensSelModel |
ensemble selection model (output of CSMES.ensSel) |
memberPreds |
matrix containing ensemble member library predictions |
y |
vector with true class labels. Currently, only a dichotomous outcome variable is supported |
curveType |
the type of cost curve used to construct the ensemble nomination curve. Should be "brierCost", "brierSkew" or "costCurve" (default). |
method |
how candidate ensemble learner predictions are used to generate the ensemble nomination front: "classPreds" for class predictions (default), "probPreds" for probability predictions. |
plotting |
logical indicating whether a plot is generated. Default is FALSE. |
nrBootstraps |
optionally, the ensemble nomination curve can be generated through bootstrapping. This argument specifies the number of iterations/bootstrap samples. Default is 1. |
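The nrBootstraps argument can be pictured with a small base-R sketch. It illustrates the general idea of estimating a cost curve over bootstrap samples and is not the package's implementation: the simulated data, the helper cost_curve(), and the choice to average the resampled curves are all assumptions made for illustration.

```r
set.seed(1)

# Simulated labels and class predictions for one hypothetical classifier.
n <- 200
y <- rbinom(n, 1, 0.3)
pred <- ifelse(runif(n) < 0.8, y, 1 - y)  # agrees with y about 80% of the time

# Grid of operating conditions (probability-cost values in [0, 1]).
pc <- seq(0, 1, by = 0.05)

# Hypothetical helper: cost curve of one classifier on one sample,
# normalized expected cost = FNR * pc + FPR * (1 - pc).
cost_curve <- function(y, pred, pc) {
  fnr <- mean(pred[y == 1] == 0)
  fpr <- mean(pred[y == 0] == 1)
  fnr * pc + fpr * (1 - pc)
}

# Recompute the curve on B bootstrap samples of the observations.
B <- 25
boot <- replicate(B, {
  idx <- sample(n, n, replace = TRUE)
  cost_curve(y[idx], pred[idx], pc)
})

# Assumed aggregation: average the curves point by point.
avg_curve <- rowMeans(boot)
```

With B set to 1, the sketch reduces to a curve computed on a single resample of the data.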
Value
An object of the class CSMES.ensNomCurve, which is a list with the following components:
nomcurve |
the ensemble nomination curve |
curves |
individual cost curves or Brier curves of ensemble members |
intervals |
resolution of the ensemble nomination curve |
incidence |
incidence (positive rate) of the outcome variable |
area_under_curve |
area under the ensemble nomination curve |
method |
method used to generate the ensemble nomination front: "classPreds" for class predictions (default), "probPreds" for probability predictions |
curveType |
the type of cost curve used to construct the ensemble nomination curve |
nrBootstraps |
number of bootstrap samples over which the ensemble nomination curve was estimated |
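How the components nomcurve, curves, intervals and area_under_curve relate can be illustrated with a minimal base-R sketch. It assumes that the nomination curve behaves like a lower envelope over the candidates' cost curves; the package internals may differ. The FPR/FNR values are made up, and the variable names are hypothetical rather than taken from CSMES.

```r
# Made-up (FPR, FNR) pairs for three hypothetical candidate ensembles.
fpr <- c(0.10, 0.25, 0.50)
fnr <- c(0.60, 0.30, 0.10)

# Resolution of the curve: grid of probability-cost values in [0, 1].
pc <- seq(0, 1, by = 0.01)

# One cost curve per candidate (one column each):
# normalized expected cost = FNR * pc + FPR * (1 - pc).
curves <- sapply(seq_along(fpr), function(j) fnr[j] * pc + fpr[j] * (1 - pc))

# Lower envelope: lowest cost at each grid point, and the candidate
# that attains it (the candidate "nominated" for that operating condition).
envelope  <- apply(curves, 1, min)
nominated <- apply(curves, 1, which.min)

# Area under the envelope by the trapezoidal rule.
auc <- sum(diff(pc) * (head(envelope, -1) + tail(envelope, -1)) / 2)
```

In this toy setting the first candidate is nominated where false positives dominate the cost (pc near 0) and the third where false negatives dominate (pc near 1).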
Author(s)
Koen W. De Bock, kdebock@audencia.com
References
De Bock, K.W., Lessmann, S. and Coussement, K., Cost-sensitive business failure prediction when misclassification costs are uncertain: A heterogeneous ensemble selection approach, European Journal of Operational Research (2020), doi: 10.1016/j.ejor.2020.01.052.
See Also
CSMES.ensSel, CSMES.predictPareto, CSMES.predict
Examples
##load packages and data (the BFP data set ships with CSMES)
library(CSMES)
library(rpart)
library(zoo)
library(ROCR)
library(mco)
data(BFP)
##shuffle the rows and split into non-overlapping train, validation and test sets
BFP_r<-BFP[sample(nrow(BFP),nrow(BFP)),]
size<-nrow(BFP_r)
##size<-300
train<-BFP_r[1:floor(size/3),]
val<-BFP_r[(floor(size/3)+1):floor(2*size/3),]
test<-BFP_r[(floor(2*size/3)+1):size,]
##generate a list containing model specifications for 100 CART decision trees varying in the cp
##and minsplit parameters, and trained on bootstrap samples (bagging)
rpartSpecs<-list()
for (i in 1:100){
  ##bootstrap sample of the training rows (rows, not columns)
  data<-train[sample(1:nrow(train),size=nrow(train),replace=TRUE),]
  rpartSpecs[[paste0("rpart",i)]]<-rpart(Class~.,data=data,method="class",
    control=rpart.control(minsplit=round(runif(1, min = 1, max = 20)),
    cp=runif(1, min = 0.05, max = 0.4)))
}
##generate validation-set predictions (probability of the second class) for these models
hillclimb<-mat.or.vec(nrow(val),100)
for (i in 1:100){
  hillclimb[,i]<-predict(rpartSpecs[[i]],newdata=val)[,2]
}
##run ensemble selection on the validation-set predictions
ESmodel<-CSMES.ensSel(hillclimb,val$Class,obj1="FNR",obj2="FPR",selType="selection",
generations=10,popsize=12,plot=TRUE)
##create the ensemble nomination curve
enc<-CSMES.ensNomCurve(ESmodel,hillclimb,val$Class,curveType="costCurve",method="classPreds",
plotting=FALSE)