irb.train_aft {irboost}    R Documentation

Fit a robust accelerated failure time model with an iteratively reweighted boosting algorithm

Description

Fit an accelerated failure time (AFT) model with the iteratively reweighted convex optimization (IRCO) algorithm, which minimizes robust loss functions in the CC-family (concave-convex). The convex optimization is carried out by the functional descent boosting algorithm in the R package xgboost. The iteratively reweighted boosting (IRBoost) algorithm reduces the weight of any observation that incurs a large loss; the resulting weights also help identify outliers. For time-to-event data, an AFT model provides an alternative to the commonly used proportional hazards models. Note that function irb.train_aft was developed to accommodate the data input format used by function xgb.train with objective="survival:aft" in package xgboost. For other objective functions, the input format currently differs and follows function xgboost instead.
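The IRCO idea described above can be sketched in a few lines of base R. This is an illustrative toy on weighted least squares, not the package implementation; the exponential weight exp(-loss/s) is assumed here purely as a ccave-style example:

```r
# Toy IRCO sketch: alternate a convex (weighted least-squares) fit with a
# reweighting step that shrinks the weight of large-loss observations.
set.seed(1)
x <- 1:10
y <- 2 * x + rnorm(10, sd = 0.1)
y[10] <- 50                       # an outlier
w <- rep(1, length(y))            # initial weights (all equal)
s <- 1                            # tuning parameter of the concave component
for (it in 1:10) {
  fit  <- lm(y ~ x, weights = w)  # convex optimization step
  loss <- residuals(fit)^2 / 2    # convex component of the loss
  w    <- exp(-loss / s)          # assumed ccave-style reweighting
}
round(w, 3)                       # the outlier's weight is driven toward 0
```

After a few iterations the inliers keep weights near 1 while the outlier's weight collapses, which is how the returned weights can flag aberrant observations.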

Usage

irb.train_aft(
  params = list(),
  data,
  z_init = NULL,
  cfun = "ccave",
  s = 1,
  delta = 0.1,
  iter = 10,
  nrounds = 100,
  del = 1e-10,
  trace = FALSE,
  ...
)

Arguments

params

the list of parameters used in xgb.train of xgboost.
It must include aft_loss_distribution and aft_loss_distribution_scale, but there is no need to include objective. The complete list of parameters is available in the online documentation.

data

training dataset. irb.train_aft accepts only an xgb.DMatrix as input.

z_init

vector of length nobs containing initial convex component values; must be non-negative. Defaults to the observation weights if provided, otherwise to a vector of 1s.

cfun

concave component of the CC-family; can be "hcave", "acave", "bcave", "ccave", "dcave", "ecave", "gcave" or "tcave".
See Table 2 at https://arxiv.org/pdf/2010.02848.pdf

s

tuning parameter of cfun. Requires s > 0, except that s = 0 is allowed for cfun="tcave". If s is too close to 0 for cfun="acave", "bcave" or "ccave", the computed weights can become 0 for all observations, which crashes the program.

delta

a small positive number supplied by the user, needed only if cfun="gcave" and 0 < s < 1.

iter

number of iterations in the IRCO algorithm.

nrounds

number of boosting iterations in xgb.train within each IRCO iteration.

del

convergence criterion of the IRCO algorithm; unrelated to delta.

trace

if TRUE, fitting progress is reported

...

other arguments passed to xgb.train.
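As noted for argument s, choosing s too close to 0 can zero out every weight. A quick numeric illustration, assuming a ccave-style weight of the form exp(-loss/s) (an assumption for illustration, not the exact package formula):

```r
# With a tiny s, even moderate convex losses underflow to weight 0
# in double precision, leaving no observation to fit.
loss <- c(0.5, 1, 2)   # hypothetical convex losses
s <- 1e-4
exp(-loss / s)         # all exactly 0 in double precision
```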

Value

An object of class xgb.Booster with additional elements, including weight_update, a vector of observation weights computed by the IRCO algorithm.

Author(s)

Zhu Wang
Maintainer: Zhu Wang zhuwang@gmail.com

References

Wang, Zhu (2021), Unified Robust Boosting, arXiv eprint, https://arxiv.org/abs/2101.07718

See Also

irboost

Examples


library("xgboost")
X <- matrix(1:5, ncol = 1)

# Associate ranged labels with the data matrix.
# This example shows each kind of censored label.
#           uncensored  right  left  interval
y_lower <- c(10,  15, -Inf, 30, 100)
y_upper <- c(Inf, Inf,   20, 50, Inf)
dtrain <- xgb.DMatrix(data = X, label_lower_bound = y_lower,
                      label_upper_bound = y_upper)
params <- list(objective = "survival:aft", aft_loss_distribution = "normal",
               aft_loss_distribution_scale = 1, max_depth = 3,
               min_child_weight = 0)
watchlist <- list(train = dtrain)
bst <- xgb.train(params, data = dtrain, nrounds = 15, watchlist = watchlist)
predict(bst, dtrain)
bst_cc <- irb.train_aft(params, data = dtrain, nrounds = 15,
                        watchlist = watchlist, cfun = "hcave",
                        s = 1.5, trace = TRUE, verbose = 0)
bst_cc$weight_update
predict(bst_cc, dtrain)
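The weight_update vector can be screened for small values to flag potential outliers. A self-contained sketch with hypothetical weights (the 0.5 threshold is an illustrative choice, not a package default):

```r
# Observations with small IRCO weights were persistently down-weighted
# and are candidate outliers.
w <- c(0.98, 1.00, 0.97, 0.02, 0.99)  # hypothetical weight_update values
which(w < 0.5)                        # observation 4
```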



[Package irboost version 0.1-1.5 Index]