gbm_filter {creditmodel}    R Documentation

Select Features using GBM

Description

gbm_filter selects important features using a gradient boosting machine (GBM).

Usage

gbm_filter(
  dat,
  target = NULL,
  x_list = NULL,
  ex_cols = NULL,
  pos_flag = NULL,
  GBM.params = gbm_params(),
  cores_num = 2,
  vars_name = TRUE,
  note = TRUE,
  save_data = FALSE,
  file_name = NULL,
  dir_path = tempdir(),
  seed = 46,
  ...
)

Arguments

dat

A data.frame containing the independent variables and the target variable.

target

The name of the target variable.

x_list

Names of independent variables.

ex_cols

A list of excluded variables. Regular expressions can also be used to match variable names. Default is NULL.

pos_flag

The value of the positive class of the target variable. Default is "1".

GBM.params

Parameters of GBM.

cores_num

The number of CPU cores to use.

vars_name

Logical. Whether to output a list of filtered variables or a table with detailed IV and PSI values for each variable. Default is TRUE.

note

Logical. Whether to print progress information. Default is TRUE.

save_data

Logical. Whether to save the results to a locally specified folder. Default is FALSE.

file_name

The file name for the saved results. Default is "Feature_importance_GBDT".

dir_path

The directory path for the saved results. Default is tempdir().

seed

Random number seed. Default is 46.

...

Other parameters to pass to gbm_params.
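
The ex_cols argument above accepts regular expressions. As a rough illustration of what pattern-based exclusion can look like, here is a minimal base R sketch. It does not show creditmodel's internal matching logic; the toy data frame, the patterns, and the way they are combined are assumptions made for the example.

```r
# Toy data; column names are made up for illustration.
dat <- data.frame(
  ID = 1:3,
  apply_date = as.Date("2020-01-01") + 0:2,
  PAY_0 = c(0, 1, 2),
  PAY_2 = c(0, 0, 1),
  BILL_AMT1 = c(100, 200, 300),
  target = c(0, 1, 0)
)

# Hypothetical exclusion patterns: an exact name, a suffix, and a prefix.
ex_cols <- c("^ID$", "date$", "^BILL_")

# Assumption: patterns are OR-ed together and matched against column names.
pattern <- paste(ex_cols, collapse = "|")
keep <- names(dat)[!grepl(pattern, names(dat))]

# Drop the target so only candidate predictors remain.
x_list <- setdiff(keep, "target")
print(x_list)
```

Anchoring patterns with ^ and $ helps avoid accidentally excluding variables whose names merely contain the pattern.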

Value

Selected variables.

See Also

psi_iv_filter, xgb_filter, feature_selector

Examples

GBM.params = gbm_params(n.trees = 2, interaction.depth = 2, shrinkage = 0.1,
                        bag.fraction = 1, train.fraction = 1,
                        n.minobsinnode = 30,
                        cv.folds = 2)
## Not run: 
features = gbm_filter(dat = UCICreditCard[1:1000, c(8:12, 26)],
                      target = "default.payment.next.month",
                      occur_time = "apply_date",
                      GBM.params = GBM.params,
                      vars_name = FALSE)

## End(Not run)
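
For readers who want to see the general idea behind GBM-based feature selection, the following sketch calls the gbm package directly on simulated data and keeps variables with non-zero relative influence. It is not the implementation used by gbm_filter; the toy data, the number of trees, and the "relative influence greater than zero" rule are assumptions chosen for illustration.

```r
# Sketch only: uses the gbm package directly, not creditmodel internals.
if (requireNamespace("gbm", quietly = TRUE)) {
  set.seed(46)
  n <- 500

  # Simulated predictors: x1 and x2 drive the outcome, noise does not.
  toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
  toy$y <- rbinom(n, 1, plogis(2 * toy$x1 - toy$x2))

  # Fit a small GBM for a binary (0/1) target.
  fit <- gbm::gbm(
    y ~ x1 + x2 + noise,
    data = toy,
    distribution = "bernoulli",
    n.trees = 50,
    interaction.depth = 2,
    shrinkage = 0.1,
    bag.fraction = 1,
    n.minobsinnode = 30,
    verbose = FALSE
  )

  # Relative influence per variable, as a data.frame with var and rel.inf.
  imp <- summary(fit, n.trees = 50, plotit = FALSE)

  # Hypothetical selection rule: keep variables with non-zero influence.
  selected <- as.character(imp$var[imp$rel.inf > 0])
  print(imp)
  print(selected)
}
```

In practice a stricter threshold, or a top-k cutoff on the relative influence, may be more appropriate than keeping every variable with non-zero influence.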

[Package creditmodel version 1.3.0 Index]