woe.binning.deploy {woeBinning} | R Documentation |
Deployment of Binning
Description
woe.binning.deploy applies the binning solution generated and saved via the woe.binning or woe.tree.binning function to (new) data.
Usage
woe.binning.deploy(df, binning, min.iv.total, add.woe.or.dum.var)
Arguments
df |
Name of the data frame the binning solution - that was generated via the function woe.binning or woe.tree.binning - should be applied to. |
binning |
Binning information generated from the woe.binning or woe.tree.binning function. |
min.iv.total |
If the total IV (information value) of a binned variable falls below this limit (e.g. 0.1), the variable will not be added to the data. Omit this parameter to add all binned variables (default). |
add.woe.or.dum.var |
add.woe.or.dum.var="woe" adds an additional variable with WOE scores, and add.woe.or.dum.var="dum" adds dummy variables for each (aggregated) level of the binned variable. If you request dummy variables, make sure that you have set an appropriate abbrev.fact.levels parameter in the woe.binning or woe.tree.binning function. |
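To illustrate how a threshold such as min.iv.total acts on a variable, the following base R sketch computes WOE values and a total IV from hypothetical bin counts using the common textbook definitions. The counts, the variable names and the formulas are assumptions for illustration only; the package may use a different sign or scaling convention for WOE.

```r
# Hypothetical counts per bin (not taken from any real data set)
good <- c(100, 200, 300)   # non-events per bin
bad  <- c(60, 30, 10)      # events per bin

# Share of all non-events and of all events that fall into each bin
p.good <- good / sum(good)
p.bad  <- bad / sum(bad)

# Textbook weight of evidence per bin and total information value
woe      <- log(p.good / p.bad)
iv.total <- sum((p.good - p.bad) * woe)

# A variable is kept only if its total IV reaches the limit
min.iv.total  <- 0.1
keep.variable <- iv.total >= min.iv.total

print(round(woe, 3))
print(round(iv.total, 3))
print(keep.variable)
```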
General Procedure
woe.binning.deploy applies the binning information that was generated from the woe.binning or woe.tree.binning function to a data frame. In this data frame the names of the variables to be binned need to be identical to the ones used with the woe.binning or woe.tree.binning function. For each variable a binned version will be added.
Optionally, a variable with the associated weight of evidence (WOE) values or the corresponding
dummy variables (one dummy variable for each final bin) can be added as well.
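The procedure above can be sketched as a train/holdout workflow: the binning is generated on one part of the data and deployed to another part that has the same variable names. The 70/30 split, the seed and the object names train, holdout and holdout.binned are assumptions chosen for illustration.

```r
library(woeBinning)

# Load the German credit data and keep a few variables
data(germancredit)
df <- germancredit[, c('creditability', 'credit.amount', 'duration.in.month')]

# Hypothetical 70/30 split into training and holdout rows
set.seed(1)
train.rows <- sample(nrow(df), size = round(0.7 * nrow(df)))
train   <- df[train.rows, ]
holdout <- df[-train.rows, ]

# Generate the binning on the training rows only
binning <- woe.binning(train, 'creditability', train)

# Deploy it to the holdout rows and add WOE variables
holdout.binned <- woe.binning.deploy(holdout, binning,
                                     add.woe.or.dum.var = 'woe')

# List the variables that deployment added
print(setdiff(names(holdout.binned), names(holdout)))
```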
Handling of Missing Data
If NAs already occurred during the woe.binning or woe.tree.binning binning process, the bin ‘Missing’ is displayed and a corresponding WOE value can be computed.
If NAs only occur in the deployment scenario, ‘Missing’ is displayed for numeric variables and ‘unknown’ for factors, and the corresponding WOE values will be NA as well.
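A minimal sketch of the second case, assuming the binning was generated on data without NAs in these variables and that NAs are then introduced artificially into a copy used for deployment. The object names new.df and deployed and the choice of rows are hypothetical.

```r
library(woeBinning)

data(germancredit)
df <- germancredit[, c('creditability', 'credit.amount', 'purpose')]
binning <- woe.binning(df, 'creditability', df)

# Hypothetical deployment data: copy a few rows and blank out some values
new.df <- df[1:20, ]
new.df$credit.amount[1:3] <- NA   # numeric variable
new.df$purpose[4:6] <- NA         # factor

deployed <- woe.binning.deploy(new.df, binning,
                               add.woe.or.dum.var = 'woe')

# Inspect the added variables for the rows with missing inputs
added   <- setdiff(names(deployed), names(new.df))
na.rows <- is.na(deployed$credit.amount) | is.na(deployed$purpose)
print(deployed[na.rows, added])
```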
Handling of Unknown Factor Levels
For factor levels that have not been provided in generating the binning solution via the woe.binning or woe.tree.binning function, a new factor level ‘unknown’ is displayed and the corresponding WOE value will be NA.
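The following sketch shows one way to check this behaviour, assuming a level that was not present when the binning was generated. The level name 'hypothetical.new.purpose' and the object names new.df and deployed are made up for illustration.

```r
library(woeBinning)

data(germancredit)
df <- germancredit[, c('creditability', 'purpose')]
binning <- woe.binning(df, 'creditability', df)

# Hypothetical deployment data with a level unseen during binning
new.df <- df[1:20, ]
purpose.chr <- as.character(new.df$purpose)
purpose.chr[1:3] <- 'hypothetical.new.purpose'
new.df$purpose <- factor(purpose.chr)

deployed <- woe.binning.deploy(new.df, binning,
                               add.woe.or.dum.var = 'woe')

# Inspect the added variables for the rows carrying the unseen level
added    <- setdiff(names(deployed), names(new.df))
new.rows <- as.character(deployed$purpose) == 'hypothetical.new.purpose'
print(deployed[new.rows, added])
```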
Examples
# Load the package
library(woeBinning)

# Load German credit data and create a subset
data(germancredit)
df <- germancredit[, c('creditability', 'credit.amount', 'duration.in.month',
                       'savings.account.and.bonds', 'purpose')]

# Bin all variables of the data frame (apart from the target variable)
# with default parameter settings
binning <- woe.binning(df, 'creditability', df)

# Deploy the binning solution to the data frame
# (add all binned variables and corresponding WOE variables)
df.with.binned.vars.added <- woe.binning.deploy(df, binning,
                                                add.woe.or.dum.var='woe')

# Deploy the binning solution to the data frame
# (add binned variables with IV>=0.1 and corresponding dummy variables)
df.with.binned.vars.added <- woe.binning.deploy(df, binning,
                                                min.iv.total=0.1,
                                                add.woe.or.dum.var='dum')