
cv.npcs {npcs}

R Documentation

Compare the performance of the NPMC-CX, NPMC-ER, and vanilla models through cross-validation or bootstrapping methods

Description

Compares the NPMC-CX, NPMC-ER, and vanilla models using cross-validation or bootstrapping. The function returns a summary of evaluation metrics and, when plotit = TRUE, a box plot of the class-specific error rates.
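The class-specific error rate of class k is the share of observations whose true class is k but which are predicted as some other class. The base R sketch below illustrates that quantity only; `class_error_rates` is a hypothetical helper, not a function of npcs, and reading an NA in alpha as "no target for this class" is an assumption drawn from the example on this page.

```r
# Hypothetical helper (not part of npcs): per-class error rate, i.e. the
# share of observations of class k that are predicted as another class.
class_error_rates <- function(y_true, y_pred) {
  classes <- levels(factor(y_true))
  vapply(classes, function(k) {
    in_class <- as.character(y_true) == k
    mean(as.character(y_pred[in_class]) != k)
  }, numeric(1))
}

y_true <- c(1, 1, 1, 1, 2, 2, 3, 3, 3, 3)
y_pred <- c(1, 1, 2, 1, 2, 2, 3, 1, 3, 2)
err <- class_error_rates(y_true, y_pred)
print(err)

# Compare against target levels; NA is assumed to mean "no target".
alpha <- c(0.05, NA, 0.01)
violated <- names(which(err > alpha))
print(violated)
```

A classifier that meets the targets would leave `violated` empty; here classes 1 and 3 exceed their levels, while class 2 is skipped because its target is NA.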

Usage

cv.npcs(
  x,
  y,
  classifier,
  alpha,
  w,
  fold = 5,
  stratified = TRUE,
  partition_ratio = 0.7,
  resample = c("bootstrapping", "cv"),
  seed = 1,
  verbose = TRUE,
  plotit = TRUE,
  trControl = list(),
  tuneGrid = list()
)

Arguments

`x`	matrix; the predictor matrix of the complete data.
`y`	numeric/factor/string; the response vector of the complete data.
`classifier`	string; the model to use in the npcs function.
`alpha`	the target levels at which the error rate of each class is controlled. The length must equal the number of classes.
`w`	the weights in the objective function. Must be a vector of length K, where K is the number of classes.
`fold`	integer; the number of folds in CV or the number of bootstrapping iterations. Default is 5.
`stratified`	logical; if TRUE, the sample is split into groups that preserve the class proportions of the response vector.
`partition_ratio`	numeric; the proportion of data used for model construction when resample = "bootstrapping".
`resample`	string; the resampling method. "bootstrapping": bootstrapping, with the number of iterations set by fold. "cv": cross-validation, with the number of folds set by fold.
`seed`	the random seed.
`verbose`	logical; if TRUE, cv.npcs prints its progress. If FALSE, it runs silently.
`plotit`	logical; if TRUE, the output list includes a box plot summarizing the error rates of the vanilla and NPMC models.
`trControl`	list; the resampling method within each fold.
`tuneGrid`	list; settings for hyperparameter tuning.
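The interplay of fold, stratified, partition_ratio, and resample can be pictured with a small base R sketch. Both helpers below are hypothetical and are not taken from the npcs source. In particular, the sketch assumes that the "bootstrapping" mode draws a random partition of the rows without replacement, which the description of partition_ratio suggests but does not state.

```r
# Hypothetical helpers (not part of npcs) illustrating the two schemes.

# Stratified k-fold: deal the shuffled rows of each class out to the folds
# in turn, so every fold roughly preserves the class proportions of y.
stratified_folds <- function(y, fold = 5, seed = 1) {
  set.seed(seed)
  fold_id <- integer(length(y))
  for (idx in split(seq_along(y), y)) {
    shuffled <- idx[sample.int(length(idx))]
    fold_id[shuffled] <- rep_len(seq_len(fold), length(idx))
  }
  fold_id
}

# Repeated random split: in each of `fold` iterations, a stratified share
# `partition_ratio` of the rows is kept for model construction and the
# remaining rows are held out for evaluation.
random_splits <- function(y, fold = 5, partition_ratio = 0.7, seed = 1) {
  set.seed(seed)
  lapply(seq_len(fold), function(i) {
    train <- unlist(lapply(split(seq_along(y), y), function(idx) {
      n_train <- round(partition_ratio * length(idx))
      idx[sample.int(length(idx), size = n_train)]
    }), use.names = FALSE)
    list(train = sort(train), test = setdiff(seq_along(y), train))
  })
}

y <- rep(c("a", "b", "c"), times = c(50, 30, 20))

folds <- stratified_folds(y, fold = 5)
fold_table <- table(y, folds)   # rows: classes, columns: folds
print(fold_table)

splits <- random_splits(y, fold = 5, partition_ratio = 0.7)
print(length(splits[[1]]$train))
```

With stratification, each class is spread evenly over the folds, so rare classes still appear in every evaluation set. This matters when error rates are controlled per class.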

Examples

# data generation: case 1 in Tian, Y., & Feng, Y. (2021) with n = 1000
set.seed(123, kind = "L'Ecuyer-CMRG")
train.set <- generate_data(n = 1000, model.no = 1)
x <- train.set$x
y <- train.set$y
test.set <- generate_data(n = 2000, model.no = 1)
x.test <- test.set$x
y.test <- test.set$y
alpha <- c(0.05, NA, 0.01)
w <- c(0, 1, 0)
# construct the multi-class NP problem

cv.npcs.knn <- cv.npcs(x, y, classifier = "knn", w = w, alpha = alpha)
# result summary and visualization
cv.npcs.knn$summaries
cv.npcs.knn$plot

[Package npcs version 0.1.1 Index]