get_psi_all {creditmodel} | R Documentation |
Calculate Population Stability Index (PSI)
Description
get_psi calculates the Population Stability Index (PSI) of a single independent variable.
get_psi_all loops over all specified independent variables and calculates the PSI of each.
Usage
get_psi_all(
  dat,
  x_list = NULL,
  target = NULL,
  dat_test = NULL,
  breaks_list = NULL,
  occur_time = NULL,
  start_date = NULL,
  cut_date = NULL,
  oot_pct = 0.7,
  pos_flag = NULL,
  parallel = FALSE,
  ex_cols = NULL,
  as_table = FALSE,
  g = 10,
  bins_no = TRUE,
  note = FALSE
)

get_psi(
  dat,
  x,
  target = NULL,
  dat_test = NULL,
  occur_time = NULL,
  start_date = NULL,
  cut_date = NULL,
  pos_flag = NULL,
  breaks = NULL,
  breaks_list = NULL,
  oot_pct = 0.7,
  g = 10,
  as_table = TRUE,
  note = FALSE,
  bins_no = TRUE
)
Arguments
dat |
A data.frame with independent variables and target variable. |
x_list |
Names of independent variables. |
target |
The name of target variable. |
dat_test |
A data.frame of test data. Default is NULL. |
breaks_list |
A table containing a list of splitting points for each independent variable. Default is NULL. |
occur_time |
The name of the variable that represents the time at which each observation takes place. |
start_date |
The earliest occurrence time of observations. |
cut_date |
Time point for splitting the data set, e.g. splitting it into the Actual and Expected data sets. |
oot_pct |
Percentage of observations retained for the out-of-time test (especially to calculate PSI). Default is 0.7. |
pos_flag |
Value of positive class, Default is "1". |
parallel |
Logical, parallel computing. Default is FALSE. |
ex_cols |
Names of excluded variables. Regular expressions can also be used to match variable names. Default is NULL. |
as_table |
Logical, output results in a table. Default is TRUE for get_psi and FALSE for get_psi_all. |
g |
Number of initial breakpoints for equal frequency binning. Default is 10. |
bins_no |
Logical, add serial numbers to bins. Default is TRUE. |
note |
Logical, outputs info. Default is FALSE. |
x |
The name of an independent variable. |
breaks |
Splitting points for an independent variable. Default is NULL. |
Details
PSI rules for evaluating the stability of a predictor:
Less than 0.02: Very stable
0.02 to 0.1: Stable
0.1 to 0.2: Unstable
0.2 to 0.5: Change
More than 0.5: Great change
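The rules above can be illustrated with a minimal base R sketch. It assumes the commonly used PSI definition, the sum over bins of (actual share - expected share) * log(actual share / expected share), with equal frequency breakpoints taken from the expected sample. The helpers psi_sketch and psi_label are hypothetical names used only for illustration. They are not part of creditmodel, and the package's internal binning and boundary handling may differ.

```r
# Hypothetical helper (not a creditmodel function): PSI of one numeric variable.
psi_sketch <- function(expected, actual, g = 10, eps = 1e-6) {
  # Equal frequency breakpoints computed on the expected (baseline) sample.
  probs <- seq(0, 1, length.out = g + 1)
  pts <- unique(quantile(expected, probs = probs, na.rm = TRUE, names = FALSE))
  # Drop the minimum and maximum so the outer bins stay open-ended.
  pts <- pts[-c(1, length(pts))]
  breaks <- c(-Inf, pts, Inf)

  # Share of non-missing observations falling into each bin.
  e_pct <- as.numeric(table(cut(expected, breaks))) / sum(!is.na(expected))
  a_pct <- as.numeric(table(cut(actual, breaks))) / sum(!is.na(actual))

  # Floor empty bins at a small value to avoid log(0) and division by zero.
  e_pct <- pmax(e_pct, eps)
  a_pct <- pmax(a_pct, eps)

  sum((a_pct - e_pct) * log(a_pct / e_pct))
}

# Hypothetical helper: map a PSI value to the labels listed in Details.
# Treating each lower bound as inclusive is an assumption; the source
# does not state how values exactly on a boundary are classified.
psi_label <- function(psi) {
  as.character(cut(
    psi,
    breaks = c(-Inf, 0.02, 0.1, 0.2, 0.5, Inf),
    labels = c("Very stable", "Stable", "Unstable", "Change", "Great change"),
    right = FALSE
  ))
}

# Usage on simulated data: a baseline sample and a shifted sample.
set.seed(1)
x_expected <- rnorm(1000)
x_actual <- rnorm(1000, mean = 1)
psi_value <- psi_sketch(x_expected, x_actual)
psi_label(psi_value)
```

Taking the breakpoints from the expected sample alone keeps the bins fixed, so any change in the actual shares reflects a shift in the distribution rather than a change in the binning.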
See Also
get_iv, get_iv_all, get_psi, get_psi_all
Examples
# dat_test is null
get_psi(dat = UCICreditCard, x = "PAY_3", occur_time = "apply_date")
# dat_test is not NULL
# train_test split
train_test = train_test_split(dat = UCICreditCard, prop = 0.7, split_type = "OOT",
                              occur_time = "apply_date", start_date = NULL, cut_date = NULL,
                              save_data = FALSE, note = FALSE)
dat_ex = train_test$train
dat_ac = train_test$test
# generate psi table
get_psi(dat = dat_ex, dat_test = dat_ac, x = "PAY_3",
        occur_time = "apply_date", bins_no = TRUE)