testXGboost {BAGofT}    R Documentation

Testing XGboosts

Description

testXGboost specifies an XGBoost model as the classifier to test. It returns a function that can be passed as the ‘testModel’ argument of BAGofT. The R package ‘xgboost’ is required.
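
Returning a function rather than a fitted model is a function-factory design: the settings are fixed once, and the returned function is called later with data. The sketch below illustrates that pattern in base R only. The name makeTestModel, its arguments trainData and validData, and the use of glm as a stand-in classifier are assumptions made for illustration; they are not BAGofT's internals, and the interface that ‘testModel’ actually expects may differ.

```r
# Hypothetical function factory (NOT the BAGofT implementation).
# It captures `formula` and returns a function that fits a model later.
makeTestModel <- function(formula) {
  function(trainData, validData) {
    # glm is a stand-in classifier so the sketch runs without xgboost
    fit <- stats::glm(formula, family = stats::binomial(), data = trainData)
    # predicted probabilities for the held-out rows
    stats::predict(fit, newdata = validData, type = "response")
  }
}

# small toy dataset, for illustration only
set.seed(1)
toy <- data.frame(x = rnorm(100))
toy$y <- rbinom(100, 1, stats::plogis(toy$x))

tester <- makeTestModel(y ~ x)                 # configuration happens here
probs  <- tester(toy[1:70, ], toy[71:100, ])   # fitting happens here
length(probs)
```

The point of the design is that the caller can choose the classifier and its settings up front, while the code that receives the function decides when and on which data split to fit it.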

Usage

testXGboost(formula, params = list(), nrounds = 25)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to test.

params

a list of XGBoost parameters. The complete list of parameters is available in the online documentation of ‘xgboost’.

nrounds

maximum number of boosting iterations.
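
As a sketch of how the three arguments fit together, the call below overrides two booster settings and the number of rounds. max_depth and eta are standard xgboost parameter names, but the values are arbitrary, and whether a given entry is honored or overridden inside testXGboost is an assumption that should be checked against the package source. The call is guarded so that it is skipped when either package is not installed.

```r
# Illustrative settings only; the values are not recommendations.
myParams <- list(max_depth = 3, eta = 0.1)

if (requireNamespace("BAGofT", quietly = TRUE) &&
    requireNamespace("xgboost", quietly = TRUE)) {
  # testXGboost returns a function; no data is supplied here,
  # so nothing is fitted at this point
  shallowXGB <- BAGofT::testXGboost(formula = y ~ ., params = myParams,
                                    nrounds = 50)
  print(is.function(shallowXGB))
}
```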

References

Zhang, Ding and Yang (2021). "Is a Classification Procedure Good Enough? A Goodness-of-Fit Assessment Tool for Classification Learning." arXiv preprint arXiv:1911.03063v2.

Examples

## Not run: 
###################################################
# Generate a sample dataset.
###################################################
# set the random seed
set.seed(20)
# set the number of observations
n <- 200
# set the number of covariates
p <- 20

# generate covariates data
Xdat <- matrix(runif(n * p, -5, 5), nrow = n, ncol = p)
colnames(Xdat) <- paste0("x", 1:p)

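# generate random coefficients (not used below; the call still advances the
# random number generator, so it is kept to leave the simulated data unchanged)
betaVec <- rnorm(6)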
# calculate the linear predictor data
lindat <- 3 * (Xdat[, 1] < 2 & Xdat[, 1] > -2) -
  3 * (Xdat[, 1] > 2 | Xdat[, 1] < -2) +
  0.5 * (Xdat[, 2] + Xdat[, 3] + Xdat[, 4] + Xdat[, 5])
# calculate the probabilities
pdat <- 1/(1 + exp(-lindat))

# generate the response data
ydat <- sapply(pdat, function(x) stats::rbinom(1, 1, x))

# generate the dataset
dat <- data.frame(y = ydat, Xdat)

###################################################
# Obtain the testing result
###################################################

# 50 percent training set
testRes1 <- BAGofT(testModel = testXGboost(formula = y ~ .),
                   data = dat,
                   ne = n * 0.5,
                   nsplits = 20,
                   nsim = 40)
# 75 percent training set
testRes2 <- BAGofT(testModel = testXGboost(formula = y ~ .),
                   data = dat,
                   ne = n*0.75,
                   nsplits = 20,
                   nsim = 40)
# 90 percent training set
testRes3 <- BAGofT(testModel = testXGboost(formula = y ~ .),
                   data = dat,
                   ne = n*0.9,
                   nsplits = 20,
                   nsim = 40)

# print the p-values of the three tests
print(c(testRes1$p.value, testRes2$p.value, testRes3$p.value))

## End(Not run)

[Package BAGofT version 1.0.0 Index]