cv.npcs {npcs}    R Documentation

Compare the performance of the NPMC-CX, NPMC-ER, and vanilla models through cross-validation or bootstrapping methods

Description

Compare the performance of the NPMC-CX, NPMC-ER, and vanilla models through cross-validation or bootstrapping. The function returns a summary of evaluation metrics and, optionally, a plot of the class-specific error rates.

Usage

cv.npcs(
  x,
  y,
  classifier,
  alpha,
  w,
  fold = 5,
  stratified = TRUE,
  partition_ratio = 0.7,
  resample = c("bootstrapping", "cv"),
  seed = 1,
  verbose = TRUE,
  plotit = TRUE,
  trControl = list(),
  tuneGrid = list()
)

Arguments

x

matrix; the predictor matrix of complete data

y

numeric/factor/string; the response vector of complete data.

classifier

string; the model to use in the npcs function

alpha

the target levels at which the error rate of each class is controlled. The length must equal the number of classes. In the example below, NA is used for a class with no error-rate constraint.

w

the weights in the objective function. Should be a vector of length K, where K is the number of classes.
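As a quick illustration of how alpha and w line up with the classes, the following base R sketch builds the two vectors used in the Examples section and checks their lengths. The checks and the variable names K and controlled are illustrative assumptions, not part of the package, and reading NA as "no constraint on this class" is inferred from the example rather than stated on this page.

```r
# Hypothetical three-class setup mirroring the Examples section
K <- 3
alpha <- c(0.05, NA, 0.01)  # constrain class 1 at 5% and class 3 at 1%; NA assumed to mean "no constraint"
w <- c(0, 1, 0)             # objective puts all weight on the class 2 error

# Illustrative sanity checks (not package functions)
stopifnot(length(alpha) == K, length(w) == K)
stopifnot(all(is.na(alpha) | (alpha > 0 & alpha < 1)))
stopifnot(all(w >= 0))

# Indices of the classes that carry an error-rate constraint
controlled <- which(!is.na(alpha))
controlled
```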

fold

integer; number of folds in CV or number of bootstrapping iterations, default=5

stratified

logical; if TRUE, the sample is split into groups that preserve the class proportions of the response vector
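To make the idea of a stratified split concrete, here is a minimal base R sketch that assigns observations to folds class by class, so every fold keeps roughly the same class proportions as y. It is one common way to stratify and is not taken from the package source; the toy response y and the name fold_id are assumptions for illustration.

```r
set.seed(1)

# Toy response with unbalanced classes: 50 / 30 / 20 observations
y <- factor(rep(c("a", "b", "c"), times = c(50, 30, 20)))
fold <- 5

# Within each class, shuffle the indices and deal them out to the folds in turn
fold_id <- integer(length(y))
for (cls in levels(y)) {
  idx <- which(y == cls)
  idx <- idx[sample.int(length(idx))]
  fold_id[idx] <- rep_len(seq_len(fold), length(idx))
}

# Rows are folds, columns are classes; each fold holds the same class mix
table(fold_id, y)
```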

partition_ratio

numeric; the proportion of data to be used for model construction when parameter resample=="bootstrapping"

resample

string; the resampling method

  • bootstrapping: bootstrapping; the number of iterations is set by parameter "fold"

  • cv: cross-validation; the number of folds is set by parameter "fold"
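The difference between the two resampling options can be sketched with index sets in base R. This is only one plausible reading of the parameters: the "bootstrapping" branch here draws a random training subset of size partition_ratio * n without replacement in each iteration, and this page does not state whether the package samples with or without replacement. The names boot_splits and cv_splits are hypothetical.

```r
set.seed(1)
n <- 100
fold <- 5
partition_ratio <- 0.7

# "bootstrapping" reading (assumption): `fold` independent random train/test splits
boot_splits <- lapply(seq_len(fold), function(i) {
  train <- sample.int(n, size = round(partition_ratio * n))
  list(train = train, test = setdiff(seq_len(n), train))
})

# "cv" reading: one partition into `fold` disjoint folds; each fold is the test set once
fold_id <- sample(rep_len(seq_len(fold), n))
cv_splits <- lapply(seq_len(fold), function(k) {
  list(train = which(fold_id != k), test = which(fold_id == k))
})

# Train and test sizes for the first split of each scheme
lengths(boot_splits[[1]])
lengths(cv_splits[[1]])
```

In the first scheme the test sets of different iterations may overlap, while in the second every observation is used for testing exactly once.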

seed

random seed

verbose

logical; if TRUE, cv.npcs prints its progress. If FALSE, the function runs silently

plotit

logical; if TRUE, the output list includes a box plot summarizing the error rates of the vanilla model and the NPMC models

trControl

list; resampling method within each fold

tuneGrid

list; for hyperparameters tuning or setting

Examples

# data generation: case 1 in Tian, Y., & Feng, Y. (2021) with n = 1000
set.seed(123, kind = "L'Ecuyer-CMRG")
train.set <- generate_data(n = 1000, model.no = 1)
x <- train.set$x
y <- train.set$y
test.set <- generate_data(n = 2000, model.no = 1)
x.test <- test.set$x
y.test <- test.set$y
alpha <- c(0.05, NA, 0.01)
w <- c(0, 1, 0)
# construct the multi-class NP problem

cv.npcs.knn <- cv.npcs(x, y, classifier = "knn", w = w, alpha = alpha)
# result summary and visualization
cv.npcs.knn$summaries
cv.npcs.knn$plot


[Package npcs version 0.1.1 Index]