get_psi_all {creditmodel}    R Documentation

Calculate Population Stability Index (PSI)

Description

get_psi calculates the Population Stability Index (PSI) of a single independent variable. get_psi_all loops over all specified independent variables and calculates the PSI of each.

Usage

get_psi_all(
  dat,
  x_list = NULL,
  target = NULL,
  dat_test = NULL,
  breaks_list = NULL,
  occur_time = NULL,
  start_date = NULL,
  cut_date = NULL,
  oot_pct = 0.7,
  pos_flag = NULL,
  parallel = FALSE,
  ex_cols = NULL,
  as_table = FALSE,
  g = 10,
  bins_no = TRUE,
  note = FALSE
)

get_psi(
  dat,
  x,
  target = NULL,
  dat_test = NULL,
  occur_time = NULL,
  start_date = NULL,
  cut_date = NULL,
  pos_flag = NULL,
  breaks = NULL,
  breaks_list = NULL,
  oot_pct = 0.7,
  g = 10,
  as_table = TRUE,
  note = FALSE,
  bins_no = TRUE
)

Arguments

dat

A data.frame with independent variables and target variable.

x_list

Names of independent variables.

target

The name of target variable.

dat_test

A data.frame of test data. Default is NULL.

breaks_list

A table containing a list of splitting points for each independent variable. Default is NULL.

occur_time

The name of the variable that represents the time at which each observation takes place.

start_date

The earliest occurrence time of observations.

cut_date

Time point for splitting the data sets, e.g. splitting the data into Actual and Expected data sets.

oot_pct

Percentage of observations retained for the out-of-time (OOT) test, used in particular to calculate PSI. Default is 0.7.

pos_flag

Value of the positive class. Default is "1".

parallel

Logical, whether to use parallel computing. Default is FALSE.

ex_cols

Names of excluded variables. Regular expressions can also be used to match variable names. Default is NULL.

as_table

Logical, output results in a table. Default is TRUE for get_psi and FALSE for get_psi_all.

g

Number of initial breakpoints for equal frequency binning. Default is 10.

bins_no

Logical, add serial numbers to bins. Default is TRUE.

note

Logical, whether to print progress information. Default is FALSE.

x

The name of an independent variable.

breaks

Splitting points for an independent variable. Default is NULL.

Details

PSI rules for evaluating the stability of a predictor:

Less than 0.02: Very stable
0.02 to 0.1: Stable
0.1 to 0.2: Unstable
0.2 to 0.5: Change
More than 0.5: Great change
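
As an illustration of how a PSI value relates to these rules, the following is a minimal base-R sketch. It is not the creditmodel implementation. It assumes the commonly used definition PSI = sum((A_i - E_i) * log(A_i / E_i)), where E_i and A_i are the shares of expected and actual observations in bin i, and it uses equal-frequency bins derived from the expected sample. The names psi_sketch and psi_label, the eps floor for empty bins, and the handling of values that fall exactly on a threshold are assumptions made for this example.

```r
# Minimal base-R sketch of a PSI calculation (not the creditmodel implementation).
psi_sketch <- function(expected, actual, g = 10, eps = 1e-6) {
  # Equal-frequency cut points taken from the expected (baseline) sample.
  probs <- seq(0, 1, length.out = g + 1)
  cuts <- unique(quantile(expected, probs = probs, na.rm = TRUE, names = FALSE))
  # Keep interior cut points only and open the outer bins so no value is dropped.
  inner <- cuts[-c(1, length(cuts))]
  breaks <- c(-Inf, inner, Inf)
  # Share of non-missing observations per bin in each sample.
  e_pct <- as.numeric(table(cut(expected, breaks))) / sum(!is.na(expected))
  a_pct <- as.numeric(table(cut(actual, breaks))) / sum(!is.na(actual))
  # Floor the shares so that empty bins do not produce log(0) or division by zero.
  e_pct <- pmax(e_pct, eps)
  a_pct <- pmax(a_pct, eps)
  sum((a_pct - e_pct) * log(a_pct / e_pct))
}

# Map a PSI value to the labels listed above.
# Assumption: each interval includes its lower bound and excludes its upper bound.
psi_label <- function(psi) {
  cut(psi,
      breaks = c(-Inf, 0.02, 0.1, 0.2, 0.5, Inf),
      labels = c("Very stable", "Stable", "Unstable", "Change", "Great change"),
      right = FALSE)
}

set.seed(1)
expected     <- rnorm(5000)            # baseline sample
actual_same  <- rnorm(5000)            # same distribution as the baseline
actual_shift <- rnorm(5000, mean = 1)  # shifted distribution

psi_same  <- psi_sketch(expected, actual_same)
psi_shift <- psi_sketch(expected, actual_shift)
psi_label(c(psi_same, psi_shift))
```

Every term of the sum is non-negative, so the PSI is zero only when the two binned distributions coincide and grows as they diverge.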

See Also

get_iv, get_iv_all, get_psi, get_psi_all

Examples

library(creditmodel)
# dat_test is NULL: dat is split by occur_time
get_psi(dat = UCICreditCard, x = "PAY_3", occur_time = "apply_date")
# dat_test is not NULL
# train/test split (out-of-time)
train_test = train_test_split(dat = UCICreditCard, prop = 0.7, split_type = "OOT",
                              occur_time = "apply_date", start_date = NULL, cut_date = NULL,
                              save_data = FALSE, note = FALSE)
dat_ex = train_test$train  # expected data set
dat_ac = train_test$test   # actual data set
# generate PSI table
get_psi(dat = dat_ex, dat_test = dat_ac, x = "PAY_3",
        occur_time = "apply_date", bins_no = TRUE)
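
The examples above only call get_psi. A call to get_psi_all for several variables might look like the sketch below, which uses only arguments listed under Usage. The column name LIMIT_BAL is an assumption about the UCICreditCard data (only PAY_3 appears in the examples above), so the list is intersected with the available column names first. The structure of the returned object is not described on this page and is not assumed here.

```r
# Sketch only: assumes the creditmodel package and its UCICreditCard data are installed.
if (requireNamespace("creditmodel", quietly = TRUE)) {
  library(creditmodel)
  # PAY_3 comes from the examples above; LIMIT_BAL is an assumed column name.
  x_vars <- intersect(c("PAY_3", "LIMIT_BAL"), names(UCICreditCard))
  # PSI for every variable in x_vars; dat is split by occur_time as in the first example.
  psi_all <- get_psi_all(dat = UCICreditCard, x_list = x_vars,
                         occur_time = "apply_date", as_table = TRUE)
  head(psi_all)
}
```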

[Package creditmodel version 1.3.1 Index]