a-compatibility-note-for-saveRDS-save {xgboost} | R Documentation |
Do not use saveRDS
or save
for long-term archival of
models. Instead, use xgb.save
or xgb.save.raw
.
Description
It is a common practice to use the built-in saveRDS
function (or
save
) to persist R objects to the disk. While it is possible to persist
xgb.Booster
objects using saveRDS
, it is not advisable to do so if
the model is to be accessed in the future. If you train a model with the current version of
XGBoost and persist it with saveRDS
, the model is not guaranteed to be
accessible in later releases of XGBoost. To ensure that your model can be accessed in future
releases of XGBoost, use xgb.save
or xgb.save.raw
instead.
Details
Use xgb.save
to save the XGBoost model as a stand-alone file. You may opt into
the JSON format by specifying the JSON extension. To read the model back, use
xgb.load
.
Use xgb.save.raw
to save the XGBoost model as a sequence (vector) of raw bytes
in a future-proof manner. Future releases of XGBoost will be able to read the raw bytes and
re-construct the corresponding model. To read the model back, use xgb.load.raw
.
The xgb.save.raw
function is useful if you'd like to persist the XGBoost model
as part of another R object.
Note: Do not use xgb.serialize
to store models long-term. It persists not only the
model but also internal configurations and parameters, and its format is not stable across
multiple XGBoost versions. Use xgb.serialize
only for checkpointing.
For more details and explanation about model persistence and archival, consult the page https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html.
Examples
data(agaricus.train, package='xgboost')
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_depth = 2,
eta = 1, nthread = 2, nrounds = 2, objective = "binary:logistic")
# Save as a stand-alone file; load it with xgb.load()
xgb.save(bst, 'xgb.model')
bst2 <- xgb.load('xgb.model')
# Save as a stand-alone file (JSON); load it with xgb.load()
xgb.save(bst, 'xgb.model.json')
bst2 <- xgb.load('xgb.model.json')
if (file.exists('xgb.model.json')) file.remove('xgb.model.json')
# Save as a raw byte vector; load it with xgb.load.raw()
xgb_bytes <- xgb.save.raw(bst)
bst2 <- xgb.load.raw(xgb_bytes)
# Persist XGBoost model as part of another R object
obj <- list(xgb_model_bytes = xgb.save.raw(bst), description = "My first XGBoost model")
# Persist the R object. Here, saveRDS() is okay, since it doesn't persist
# xgb.Booster directly. What's being persisted is the future-proof byte representation
# as given by xgb.save.raw().
saveRDS(obj, 'my_object.rds')
# Read back the R object
obj2 <- readRDS('my_object.rds')
# Re-construct xgb.Booster object from the bytes
bst2 <- xgb.load.raw(obj2$xgb_model_bytes)
if (file.exists('my_object.rds')) file.remove('my_object.rds')