importance {EIX} | R Documentation |
Importance of variables and interactions in the model
Description
This function calculates a table with selected measures of importance for variables and interactions.
Usage
importance(xgb_model, data, option = "both", digits = 4)
Arguments
xgb_model |
an xgboost or lightgbm model. |
data |
a data table with data used to train the model. |
option |
if "variables", the table includes only single variables; if "interactions", only interactions; if "both", both single variables and interactions. Default: "both". |
digits |
number of significant digits to return. Passed to the signif() function. |
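As a small illustration of what the digits argument implies, the sketch below applies base R's signif() to a few made-up numbers (they are not output of importance()). It shows that signif() keeps significant digits rather than decimal places.

```r
# Made-up values for illustration only; not produced by importance().
x <- c(123.45678, 0.000123456, 98765.4321)

# Keep 4 significant digits, as with digits = 4.
rounded <- signif(x, digits = 4)
print(rounded)
```

Very small and very large values are therefore rounded relative to their own magnitude, which keeps columns such as gain and cover readable on different scales.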
Details
Available measures:
"sumGain" - sum of the Gain values over all nodes in which the given variable occurs,
"sumCover" - sum of the Cover values over all nodes in which the given variable occurs; for LightGBM models: the number of observations that pass through the node,
"mean5Gain" - mean Gain of the 5 occurrences of the given variable with the highest Gain,
"meanGain" - mean Gain value over all nodes in which the given variable occurs,
"meanCover" - mean Cover value over all nodes in which the given variable occurs; for LightGBM models: the mean number of observations that pass through the node,
"frequency" - number of nodes in which the given variable occurs.
Additionally, for the table with single variables:
"meanDepth" - mean depth weighted by gain,
"numberOfRoots" - number of occurrences in the root,
"weightedRoot" - mean number of occurrences in the root, which is weighted by gain.
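To make the measures above concrete, here is a minimal base R sketch that computes most of them from a hand-made table of tree nodes. It is not the EIX implementation: the node table, its values, the helper mean_top(), and the convention that the root has depth 0 are all assumptions made for this illustration, and "weightedRoot" is left out because its exact weighting is not spelled out here.

```r
# Hypothetical node table: one row per split node (values are made up,
# not taken from xgboost or EIX). Depth 0 is assumed to mark the root.
nodes <- data.frame(
  Feature = c("a", "a", "b", "a", "b", "a", "a", "a"),
  Gain    = c(10, 4, 6, 2, 8, 1, 3, 5),
  Cover   = c(100, 50, 80, 20, 60, 10, 30, 40),
  Depth   = c(0, 1, 1, 1, 0, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean of the (up to) n largest values.
mean_top <- function(x, n = 5) mean(head(sort(x, decreasing = TRUE), n))

# Aggregate the nodes of each feature into one row of measures.
by_feature <- split(nodes, nodes$Feature)
toy_importance <- do.call(rbind, lapply(by_feature, function(d) {
  data.frame(
    Feature       = d$Feature[1],
    sumGain       = sum(d$Gain),
    sumCover      = sum(d$Cover),
    meanGain      = mean(d$Gain),
    meanCover     = mean(d$Cover),
    frequency     = nrow(d),
    mean5Gain     = mean_top(d$Gain, 5),
    meanDepth     = weighted.mean(d$Depth, d$Gain),  # depth weighted by gain
    numberOfRoots = sum(d$Depth == 0)
  )
}))

# Order by total gain, highest first.
toy_importance <- toy_importance[order(-toy_importance$sumGain), ]
print(toy_importance)
```

In this toy table, feature "a" occurs in six nodes, so its "mean5Gain" (the mean of its five largest gains) differs from its "meanGain", while for feature "b", which occurs only twice, the two measures coincide.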
Value
a data table
Examples
library("EIX")
library("Matrix")
sm <- sparse.model.matrix(left ~ . - 1, data = HR_data)
library("xgboost")
param <- list(objective = "binary:logistic", max_depth = 2)
xgb_model <- xgboost(sm, params = param, label = HR_data[, left] == 1, nrounds = 25, verbose = 0)
imp <- importance(xgb_model, sm, option = "both")
imp
plot(imp, top = 10)
imp <- importance(xgb_model, sm, option = "variables")
imp
plot(imp, top = nrow(imp))
imp <- importance(xgb_model, sm, option = "interactions")
imp
plot(imp, top = nrow(imp))
imp <- importance(xgb_model, sm, option = "variables")
imp
plot(imp, top = NULL, radar = FALSE, xmeasure = "sumCover", ymeasure = "sumGain")