replace.woe {PDtoolkit} | R Documentation
Replace modalities of risk factor with weights of evidence (WoE) value
Description
replace.woe
replaces the modalities of each risk factor with the calculated WoE value. This function processes only
categorical risk factors, so numerical risk factors are assumed to have been categorized beforehand.
The additional info report (the second element of the function output, the info
data frame), if produced, includes:
rf: Risk factor name.
reason.code: Reason code. It takes value 1 if an inappropriate class of risk factor is identified, 2 if the number of categories exceeds 10, and 3 if there is any problem with the weights of evidence (WoE) calculation (usually when a bin contains only good or only bad cases). If reason code 1 or 3 is observed, the risk factor is not processed for WoE replacement.
comment: Reason description.
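The situation behind reason code 3 can be illustrated with a minimal base R sketch. It does not reproduce the package's internal code. The toy data, the object names and the sign convention (logarithm of the share of good cases over the share of bad cases) are assumptions made for illustration only; the convention used by replace.woe may differ.

```r
# Toy data: one categorical risk factor and a binary target
# (assumed coding: 1 = bad case, 0 = good case).
toy <- data.frame(
  rf     = c("a", "a", "a", "a", "b", "b", "b", "c", "c", "c"),
  target = c(0, 0, 0, 1, 0, 1, 1, 0, 0, 0)
)

# Count good and bad cases per modality.
counts <- table(toy$rf, toy$target)
n.good <- counts[, "0"]
n.bad  <- counts[, "1"]

# Assumed convention: WoE = ln(share of goods / share of bads).
woe <- log((n.good / sum(n.good)) / (n.bad / sum(n.bad)))

# Modalities that contain only good or only bad cases give a non-finite WoE.
problem <- names(woe)[!is.finite(woe)]
```

In this toy data, modality "c" contains no bad cases, so its WoE is not finite and it ends up in problem. This is the kind of bin the reason code 3 description refers to.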
Usage
replace.woe(db, target)
Arguments
db: Data frame of categorical risk factors and target variable supplied for WoE coding.
target: Name of target variable within db.
Value
The command replace.woe
returns a list of two data frames. The first contains the WoE replacement
of the analyzed risk factors' modalities, while the second reports the results of the
above-mentioned validations regarding the class of the risk factors, the number of modalities and the WoE calculation.
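As a rough illustration of what replacing modalities with WoE values means for the first data frame, the sketch below maps each category of a toy factor to a number through a named vector. The vector woe.lookup and its values are hypothetical and are not output of replace.woe.

```r
# Toy categorical risk factor.
rf <- c("a", "b", "a", "c", "b")

# Hypothetical WoE value per modality (illustrative numbers only).
woe.lookup <- c(a = 0.25, b = -1.54, c = 0.80)

# Replace each modality with its WoE value via named-vector indexing.
rf.woe <- unname(woe.lookup[rf])
```

After this step rf.woe is a numeric vector of the same length as rf, with every occurrence of a modality carrying the same WoE value.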
Examples
suppressMessages(library(PDtoolkit))
data(gcd)
#categorize numeric risk factors
gcd$maturity.bin <- ndr.bin(x = gcd$maturity, y = gcd$qual, y.type = "bina")[[2]]
gcd$amount.bin <- ndr.bin(x = gcd$amount, y = gcd$qual, y.type = "bina")[[2]]
gcd$age.bin <- ndr.bin(x = gcd$age, y = gcd$qual, y.type = "bina")[[2]]
head(gcd)
#replace modalities with WoE values
woe.rep <- replace.woe(db = gcd, target = "qual")
#results overview
head(woe.rep[[1]])
woe.rep[[2]]