importance {EIX} R Documentation

Importance of variables and interactions in the model

Description

This function calculates a table with selected measures of importance for variables and interactions.

Usage

importance(xgb_model, data, option = "both", digits = 4)


Arguments

xgb_model    an xgboost or lightgbm model.

data         a data table with the data used to train the model.

option       if "variables", the table includes only single variables; if "interactions", only interactions; if "both", both single variables and interactions. Default "both".

digits       number of significant digits to return. Passed to the signif() function.
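Because digits is handed to signif(), it counts significant digits rather than decimal places. A minimal base R sketch of the difference, using made-up numbers rather than output from importance():

```r
# Illustrative values only; not taken from an importance table.
x <- c(1234.5678, 0.00123456)

sig <- signif(x, 4)  # keeps 4 significant digits
dec <- round(x, 4)   # keeps 4 decimal places, shown for contrast

print(sig)
print(dec)
```

With the default digits = 4, large and small measures therefore keep the same relative precision.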

Details

Available measures:

• "sumGain" - sum of the Gain values in all nodes in which the given variable occurs,

• "sumCover" - sum of the Cover values in all nodes in which the given variable occurs; for LightGBM models: number of observations that pass through the node,

• "mean5Gain" - mean gain of the 5 occurrences of the given variable with the highest gain,

• "meanGain" - mean Gain value in all nodes in which the given variable occurs,

• "meanCover" - mean Cover value in all nodes in which the given variable occurs; for LightGBM models: mean number of observations that pass through the node,

• "frequency" - number of occurrences of the given variable in the nodes.
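To make the definitions above concrete, here is a rough base R sketch that computes the same kind of summaries from a small hand-made node table. The table nodes, its values, and the helper mean_top are hypothetical and only illustrate the arithmetic; this is not the code EIX runs internally.

```r
# Hypothetical node table: one row per split node (all values are made up).
nodes <- data.frame(
  Feature = c("a", "a", "b", "a", "b", "a", "a", "a"),
  Gain    = c(10, 4, 6, 2, 8, 1, 3, 5),
  Cover   = c(100, 40, 60, 20, 50, 10, 30, 45)
)

# Hypothetical helper: mean of the n largest values
# (uses all values if the variable occurs fewer than n times).
mean_top <- function(x, n = 5) mean(head(sort(x, decreasing = TRUE), n))

# One summary row per variable.
by_feature <- split(nodes, nodes$Feature)
measures <- do.call(rbind, lapply(by_feature, function(d) {
  data.frame(
    sumGain   = sum(d$Gain),
    sumCover  = sum(d$Cover),
    mean5Gain = mean_top(d$Gain, 5),
    meanGain  = mean(d$Gain),
    meanCover = mean(d$Cover),
    frequency = nrow(d)
  )
}))

print(measures)
```

For variable "a", which occurs six times, mean5Gain drops the lowest gain before averaging, so it is higher than meanGain.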

Additionally, for the table with single variables:

• "meanDepth" - mean depth weighted by gain,

• "numberOfRoots" - number of occurrences in the root,

• "weightedRoot" - mean number of occurrences in the root, weighted by gain.
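As a hedged illustration of the first two single-variable measures, the sketch below computes a gain-weighted mean depth and a root count from a hypothetical node table. It assumes that root nodes have depth 0, and it leaves out "weightedRoot" because this page does not spell out its exact formula.

```r
# Hypothetical node table (made-up values); Depth == 0 is assumed to mark a root.
nodes <- data.frame(
  Feature = c("a", "b", "a", "a", "b"),
  Depth   = c(0, 1, 1, 0, 0),
  Gain    = c(10, 6, 4, 8, 2)
)

by_feature <- split(nodes, nodes$Feature)
single <- do.call(rbind, lapply(by_feature, function(d) {
  data.frame(
    meanDepth     = weighted.mean(d$Depth, d$Gain),  # depth weighted by gain
    numberOfRoots = sum(d$Depth == 0)                # occurrences in a root
  )
}))

print(single)
```

A variable that splits mostly near the root with high gain ends up with a small meanDepth under this reading.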

Value

A data table with the selected importance measures.

Examples

library("EIX")
library("Matrix")
# Build a sparse design matrix from HR_data (response: left, no intercept)
sm <- sparse.model.matrix(left ~ . - 1, data = HR_data)

library("xgboost")
# Fit a shallow binary classifier
param <- list(objective = "binary:logistic", max_depth = 2)
xgb_model <- xgboost(sm, params = param, label = HR_data[, left] == 1, nrounds = 25, verbose = 0)

# Single variables and interactions together
imp <- importance(xgb_model, sm, option = "both")
imp
plot(imp, top = 10)

# Single variables only
imp <- importance(xgb_model, sm, option = "variables")
imp
plot(imp, top = nrow(imp))

# Interactions only
imp <- importance(xgb_model, sm, option = "interactions")
imp
plot(imp, top = nrow(imp))

# Single variables again, plotted as sumCover against sumGain instead of a radar plot
imp <- importance(xgb_model, sm, option = "variables")
imp
plot(imp, top = NULL, radar = FALSE, xmeasure = "sumCover", ymeasure = "sumGain")



[Package EIX version 1.2.0 Index]