split_bins_all {creditmodel} | R Documentation |
Split bins all
Description
split_bins is for transforming data into bins.
The split_bins_all function is a simpler wrapper for split_bins.
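As a rough illustration of what "transforming data to bins" means, the sketch below uses only base R's cut(). It is not creditmodel's implementation: the cut points and the "missing" label are hypothetical, chosen only to show the idea of mapping raw values onto interval labels.

```r
# Illustrative only: base R binning, not how split_bins works internally.
x <- c(5, 12, 18, 25, 40, NA)

# Hypothetical cut points; in creditmodel these would come from breaks_list.
breaks <- c(-Inf, 10, 20, Inf)

# Map each value to the interval that contains it (right-closed intervals).
bins <- cut(x, breaks = breaks, right = TRUE)

# Give missing values their own bin instead of leaving them as NA.
bins_chr <- as.character(bins)
bins_chr[is.na(bins_chr)] <- "missing"

# Count of observations per bin.
table(bins_chr)
```

Each value is replaced by the label of its interval, so the result is a categorical variable with one level per bin.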
Usage
split_bins_all(
  dat,
  x_list = NULL,
  ex_cols = NULL,
  breaks_list = NULL,
  bins_no = TRUE,
  note = FALSE,
  return_x = FALSE,
  char_free = FALSE,
  save_data = FALSE,
  file_name = NULL,
  dir_path = tempdir(),
  ...
)
Arguments
dat |
A data.frame with independent variables. |
x_list |
A list of x variables. |
ex_cols |
Names of excluded variables. Regular expressions can also be used to match variable names. Default is NULL. |
breaks_list |
A list containing the breaks of variables. It is generated by get_breaks_all or get_breaks. |
bins_no |
Number the generated bins. Default is TRUE. |
note |
Logical, whether to output progress information. Default is FALSE. |
return_x |
Logical, return data.frame containing only variables in x_list. |
char_free |
Logical, if TRUE, character variables are not split. |
save_data |
Logical, whether to save the results in a locally specified folder. Default is FALSE. |
file_name |
The name for the periodically saved woe file. Default is NULL. |
dir_path |
The path for the periodically saved woe file. Default is tempdir(). |
... |
Additional parameters. |
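The ex_cols argument accepts regular expressions as well as plain names. A minimal base R sketch of how such patterns could narrow a list of variable names is shown below. The patterns and the matching logic are assumptions for illustration; the package's own matching rules may differ.

```r
# Illustrative only: one way regex exclusions could filter variable names.
vars <- c("ID", "apply_date", "PAY_0", "PAY_2", "LIMIT_BAL", "target")

# Hypothetical exclusion patterns: a plain name, a suffix, and an exact match.
ex_cols <- c("ID", "date$", "^target$")

# Combine the patterns into a single alternation and drop every match.
pattern <- paste(ex_cols, collapse = "|")
x_kept <- vars[!grepl(pattern, vars)]

x_kept
```

Under this kind of matching, an unanchored pattern such as "ID" would also exclude any other name that contains "ID", so anchoring with ^ and $ is the safer choice when only one variable should be dropped.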
Value
A data.frame with the variables split into bins.
See Also
get_tree_breaks, cut_equal, select_best_class, select_best_breaks
Examples
sub = cv_split(UCICreditCard, k = 30)[[1]]
dat = UCICreditCard[sub,]
dat = re_name(dat, "default.payment.next.month", "target")
dat = data_cleansing(dat, target = "target", obs_id = "ID", occur_time = "apply_date",
                     miss_values = list("", -1))
train_test = train_test_split(dat, split_type = "OOT", prop = 0.7,
                              occur_time = "apply_date")
dat_train = train_test$train
dat_test = train_test$test
# get breaks of all predictive variables
x_list = c("PAY_0", "LIMIT_BAL", "PAY_AMT5", "EDUCATION", "PAY_3", "PAY_2")
breaks_list = get_breaks_all(dat = dat_train, target = "target",
                             x_list = x_list, occur_time = "apply_date", ex_cols = "ID",
                             save_data = FALSE, note = FALSE)
# split the variables into bins using the breaks found on the training data
train_bins = split_bins_all(dat = dat_train,
                            breaks_list = breaks_list,
                            woe_name = FALSE)
test_bins = split_bins_all(dat = dat_test,
                           breaks_list = breaks_list,
                           note = FALSE)