irboost {irboost}    R Documentation

fit a robust predictive model with the iteratively reweighted boosting algorithm

Description

Fit a predictive model using iteratively reweighted convex optimization (IRCO), which minimizes robust loss functions in the CC-family (concave-convex). The convex optimization is carried out by the functional descent boosting algorithm in the R package xgboost. The iteratively reweighted boosting (IRBoost) algorithm reduces the weights of observations that incur large losses; the resulting weights also help identify outliers. Applications include robust generalized linear models and extensions, where the mean is related to the predictors by boosting, and robust accelerated failure time models.
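
As a rough illustration of the reweighting idea, here is a minimal sketch (not the package internals; the weight function exp(-loss/s) is a stand-in for the derivative of a concave CC-family component):

library(xgboost)
set.seed(1)
x <- matrix(rnorm(100*2), 100, 2)
y <- x[, 1] + rnorm(100)
y[1:5] <- y[1:5] + 10                  # a few gross outliers
s <- 1
w <- rep(1, 100)                       # start with unit weights
for (i in 1:5) {                       # a few IRCO iterations
  dtrain <- xgb.DMatrix(x, label=y, weight=w)
  fit <- xgb.train(params=list(objective="reg:squarederror", max_depth=1),
                   data=dtrain, nrounds=50, verbose=0)
  loss <- (y - predict(fit, x))^2/2    # per-observation convex loss
  w <- exp(-loss/s)                    # downweight observations with large loss
}
round(w[1:8], 3)                       # the outlying cases receive small weights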

Usage

irboost(
  data,
  label,
  weights,
  params = list(),
  z_init = NULL,
  cfun = "ccave",
  s = 1,
  delta = 0.1,
  iter = 10,
  nrounds = 100,
  del = 1e-10,
  trace = FALSE,
  ...
)

Arguments

data

input data. If objective="survival:aft", it must be an xgb.DMatrix; otherwise, it can be a matrix of dimension nobs x nvars, with each row an observation vector. A dgCMatrix is also accepted

label

response variable. Quantitative for objective="reg:squarederror",
objective="count:poisson" (non-negative counts) or objective="reg:gamma" (positive). For objective="binary:logitraw" or "binary:hinge", label should be a factor with two levels

weights

vector of length nobs with non-negative weights

params

the list of parameters passed to the function xgboost::xgboost, which takes the same argument. The list must include objective, the convex component of the CC-family (the second C, i.e., the convex-downward part); it is the same as objective in xgboost::xgboost. The following objective functions are currently implemented:

  • reg:squarederror Regression with squared loss.

  • binary:logitraw logistic regression for binary classification; predictions are on the scale of the linear predictor, not probabilities.

  • binary:hinge hinge loss for binary classification. This makes predictions of -1 or 1, rather than producing probabilities.

  • multi:softprob softmax loss function for multiclass problems. The result contains the predicted probability of each data point belonging to each class, say p_k, k=0, ..., nclass-1. Note that label is coded in {0, ..., nclass-1}. The cross-entropy loss for the i-th observation is computed as -log(p_k) with k=label_i, i=1, ..., n.

  • count:poisson: Poisson regression for count data, predicting the mean of the Poisson distribution.

  • reg:gamma: gamma regression with log-link, predicting the mean of the
    gamma distribution. The implementation in xgboost takes a
    parameterization in the exponential family (see
    xgboost/src/metric/elementwise_metric.cu). In particular, there is a
    single dispersion parameter psi, fixed at 1; the implementation of the
    IRCO algorithm follows this parameterization. See Table 2.1 of McCullagh
    and Nelder, Generalized Linear Models, Chapman & Hall, 1989, second
    edition.

  • reg:tweedie: Tweedie regression with log-link. See also
    tweedie_variance_power, with range (1, 2): a value close to 2 behaves
    like a gamma distribution; a value close to 1 behaves like a Poisson
    distribution.

  • survival:aft: Accelerated failure time model for censored survival time
    data; irboost invokes irb.train_aft. See the data-preparation sketch
    after this list.
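
A hedged sketch of preparing censored data for objective="survival:aft": a minimal illustration assuming xgboost's AFT interface (label_lower_bound and label_upper_bound set via setinfo); the commented irboost call is illustrative rather than a documented calling pattern.

library(xgboost)
X <- matrix(rnorm(100*2), 100, 2)
tim <- rexp(100)                       # observed event times
lower <- tim
upper <- tim
upper[sample(100, 20)] <- Inf          # mark some cases as right-censored
dtrain <- xgb.DMatrix(X)
setinfo(dtrain, "label_lower_bound", lower)
setinfo(dtrain, "label_upper_bound", upper)
## fit_aft <- irboost(data=dtrain, label=lower, cfun="ccave", s=1,
##                    params=list(objective="survival:aft"), nrounds=50)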

z_init

vector of length nobs with initial convex component values; must be non-negative. Defaults to weights if provided, otherwise to a vector of 1s

cfun

concave component of the CC-family; one of "hcave", "acave", "bcave", "ccave", "dcave", "ecave", "gcave", "tcave".
See Table 2 at https://arxiv.org/pdf/2010.02848.pdf

s

tuning parameter of cfun. Requires s > 0, except that s can equal 0 for cfun="tcave". If s is too close to 0 for cfun="acave", "bcave" or "ccave", the computed weights can become 0 for all observations and crash the program

delta

a small positive number supplied by the user, needed only if cfun="gcave" and 0 < s < 1

iter

number of iterations in the IRCO algorithm

nrounds

boosting iterations within each IRCO iteration

del

convergence criterion of the IRCO algorithm; unrelated to delta

trace

if TRUE, fitting progress is reported

...

other arguments passed to xgboost

Value

An object with S3 class xgboost, with additional elements including weight_update (the robust observation weights computed by the IRCO algorithm) and loss_log (the loss values recorded across IRCO iterations); see the Examples.

Author(s)

Zhu Wang
Maintainer: Zhu Wang zhuwang@gmail.com

References

Wang, Zhu (2021), Unified Robust Boosting, arXiv eprint, https://arxiv.org/abs/2101.07718

Examples


# regression, logistic regression, Poisson regression
x <- matrix(rnorm(100*2),100,2)
g2 <- sample(c(0,1),100,replace=TRUE)
fit1 <- irboost(data=x, label=g2, cfun="acave",s=0.5, 
                params=list(objective="reg:squarederror", max_depth=1), trace=TRUE, 
                verbose=0, nrounds=50)
fit2 <- irboost(data=x, label=g2, cfun="acave",s=0.5, 
                params=list(objective="binary:logitraw", max_depth=1), trace=TRUE,  
                verbose=0, nrounds=50)
fit3 <- irboost(data=x, label=g2, cfun="acave",s=0.5, 
                params=list(objective="binary:hinge", max_depth=1), trace=TRUE,  
                verbose=0, nrounds=50)
fit4 <- irboost(data=x, label=g2, cfun="acave",s=0.5, 
                params=list(objective="count:poisson", max_depth=1), trace=TRUE,      
                verbose=0, nrounds=50)

# Gamma regression
x <- matrix(rnorm(100*2),100,2)
g2 <- sample(rgamma(100, 1))
library("xgboost")
param <- list(objective="reg:gamma", max_depth=1)
fit5 <- xgboost(data=x, label=g2, params=param, nrounds=50)
fit6 <- irboost(data=x, label=g2, cfun="acave",s=5, params=param, trace=TRUE, 
                verbose=0, nrounds=50)
plot(predict(fit5, newdata=x), predict(fit6, newdata=x))
hist(fit6$weight_update)
plot(fit6$loss_log)
summary(fit6$weight_update)

# Tweedie regression 
param <- list(objective="reg:tweedie", max_depth=1)
fit6t <- irboost(data=x, label=g2, cfun="acave",s=5, params=param, 
                 trace=TRUE, verbose=0, nrounds=50)
# Gamma vs Tweedie regression
hist(fit6$weight_update)
hist(fit6t$weight_update)
plot(predict(fit6, newdata=x), predict(fit6t, newdata=x))

# multiclass classification in iris dataset:
lb <- as.numeric(iris$Species)-1
num_class <- 3
set.seed(11)

param <- list(objective="multi:softprob", max_depth=4, eta=0.5, nthread=2,
              subsample=0.5, num_class=num_class)
fit7 <- irboost(data=as.matrix(iris[, -5]), label=lb, cfun="acave", s=50,
                params=param, trace=TRUE, verbose=0, nrounds=10)
# predict for softmax returns num_class probability numbers per case:
pred7 <- predict(fit7, newdata=as.matrix(iris[, -5]))
# reshape it to a num_class-columns matrix
pred7 <- matrix(pred7, ncol=num_class, byrow=TRUE)
# convert the probabilities to softmax labels
pred7_labels <- max.col(pred7) - 1
# classification error: 0!
sum(pred7_labels != lb)/length(lb)
table(lb, pred7_labels)
hist(fit7$weight_update)
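
# robust weights as outlier diagnostics (illustrative): contaminate a few
# responses and check that those cases tend to receive the smallest
# weight_update values
x <- matrix(rnorm(100*2),100,2)
y <- x[, 1] + rnorm(100)
y[1:5] <- y[1:5] + 10
fit8 <- irboost(data=x, label=y, cfun="ccave", s=1,
                params=list(objective="reg:squarederror", max_depth=1),
                trace=FALSE, verbose=0, nrounds=50)
order(fit8$weight_update)[1:5]         # expected to include cases 1-5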


