pmml.xgb.Booster {pmml} | R Documentation |
Generate PMML for a xgb.Booster object from the package xgboost.
Description
Generate PMML for a xgb.Booster object from the package xgboost.
Usage
## S3 method for class 'xgb.Booster'
pmml(
  model,
  model_name = "xboost_Model",
  app_name = "SoftwareAG PMML Generator",
  description = "Extreme Gradient Boosting Model",
  copyright = NULL,
  model_version = NULL,
  transforms = NULL,
  missing_value_replacement = NULL,
  input_feature_names = NULL,
  output_label_name = NULL,
  output_categories = NULL,
  xgb_dump_file = NULL,
  parent_invalid_value_treatment = "returnInvalid",
  child_invalid_value_treatment = "asIs",
  ...
)
Arguments
model |
An object created by the 'xgboost' function. |
model_name |
A name to be given to the PMML model. |
app_name |
The name of the application that generated the PMML. |
description |
A descriptive text for the Header element of the PMML. |
copyright |
The copyright notice for the model. |
model_version |
A string specifying the model version. |
transforms |
Data transformations. |
missing_value_replacement |
Value to be used as the 'missingValueReplacement' attribute for all MiningFields. |
input_feature_names |
Input variable names used in training the model. |
output_label_name |
Name of the predicted field. |
output_categories |
Possible values of the predicted field, for classification models. |
xgb_dump_file |
Name of the file saved using the 'xgb.dump' function. |
parent_invalid_value_treatment |
Invalid value treatment at the top MiningField level. |
child_invalid_value_treatment |
Invalid value treatment at the model segment MiningField level. |
... |
Further arguments passed to or from other methods. |
Details
The xgboost function takes as its input either an xgb.DMatrix object or a
numeric matrix. The input field information is not stored in the R model object,
so the field names must be passed to pmml as inputs (input_feature_names). This
enables the PMML to specify field names in its model representation. The R model
object does not store information about the fitted tree structure either. However,
this information can be extracted using the xgb.model.dt.tree function and the
file saved using the xgb.dump function. The xgboost library is therefore needed
in the environment, and this saved file is needed as an input (xgb_dump_file) as
well.
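Because the field names are not kept in the model object, it can help to confirm that the training matrix carries column names before fitting, so that the same vector can later be passed as input_feature_names. The following is a minimal base R sketch under that assumption; the helper name ensure_feature_names and the "V1", "V2", ... placeholder names are hypothetical and are not part of pmml or xgboost.

```r
# Hypothetical helper (not part of pmml or xgboost): return the numeric matrix
# with column names guaranteed, generating placeholders when none are present.
ensure_feature_names <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x))
  if (is.null(colnames(x))) {
    colnames(x) <- paste0("V", seq_len(ncol(x)))
  }
  x
}

# A 3 x 2 matrix without column names, standing in for training data.
train_matrix <- ensure_feature_names(matrix(c(1, 0, 0, 1, 1, 1), nrow = 3))

# Keep this vector; it is what would be passed as input_feature_names.
feature_names <- colnames(train_matrix)
```

Storing feature_names next to the dumped tree file keeps the two inputs that pmml needs together.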
The following objectives are currently supported: multi:softprob,
multi:softmax, binary:logistic.
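One way to catch an unsupported objective before training and exporting is a small guard such as the sketch below. It only restates the list given above; the function name check_objective and its error message are hypothetical, and pmml performs its own validation independently of this helper.

```r
# Objectives listed as supported in this help page.
supported_objectives <- c("multi:softprob", "multi:softmax", "binary:logistic")

# Hypothetical helper (not part of pmml): stop early when the objective
# is not one of the objectives listed above.
check_objective <- function(objective) {
  if (!objective %in% supported_objectives) {
    stop("Objective '", objective, "' is not supported for PMML export.")
  }
  invisible(TRUE)
}

check_objective("binary:logistic")
```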
The pmml exporter will throw an error if the xgboost model only has one tree.
The exporter only works with numeric matrices. Sparse matrices must be converted
to matrix objects before training an xgboost model for the export to work
correctly.
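As an illustration of the conversion described above, the sketch below builds a small sparse matrix with the Matrix package and converts it to a base R matrix with as.matrix. The data and the column names "f1" and "f2" are made up for the example.

```r
library(Matrix)

# A small sparse matrix (class "dgCMatrix"), standing in for sparse training data.
sparse_x <- sparseMatrix(
  i = c(1, 2, 3), j = c(1, 2, 1), x = c(1, 1, 1),
  dims = c(3, 2), dimnames = list(NULL, c("f1", "f2"))
)

# Convert to a dense numeric matrix before passing it to xgboost.
dense_x <- as.matrix(sparse_x)
```

Note that the dense copy stores every zero explicitly, so for large, very sparse data the conversion can need considerably more memory than the sparse original.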
Value
PMML representation of the xgb.Booster object.
Author(s)
Tridivesh Jena
References
xgboost: Extreme Gradient Boosting
Examples
## Not run:
# Example using the xgboost package example model.
library(xgboost)
data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")
train <- agaricus.train
test <- agaricus.test
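# agaricus.train$data is a sparse matrix, so convert it to a numeric matrix
# before training (see Details).
model1 <- xgboost(
  data = as.matrix(train$data), label = train$label,
  max_depth = 2, eta = 1, nthread = 2,
  nrounds = 2, objective = "binary:logistic"
)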
# Save the tree information in an external file:
xgb.dump(model1, "model1.dumped.trees")
# Convert to PMML:
model1_pmml <- pmml(model1,
  input_feature_names = colnames(train$data),
  output_label_name = "prediction1",
  output_categories = c("0", "1"),
  xgb_dump_file = "model1.dumped.trees"
)
# Multinomial model using iris data:
model2 <- xgboost(
  data = as.matrix(iris[, 1:4]),
  label = as.numeric(iris[, 5]) - 1,
  max_depth = 2, eta = 1, nthread = 2, nrounds = 2,
  objective = "multi:softprob", num_class = 3
)
# Save the tree information in an external file:
xgb.dump(model2, "model2.dumped.trees")
# Convert to PMML:
model2_pmml <- pmml(model2,
  input_feature_names = colnames(as.matrix(iris[, 1:4])),
  output_label_name = "Species",
  output_categories = c(1, 2, 3),
  xgb_dump_file = "model2.dumped.trees"
)
## End(Not run)