segment.vld {PDtoolkit}    R Documentation

Model segment validation

Description

segment.vld performs model segment validation based on residuals. The main goal of this procedure is to identify segments where the model in use overestimates or underestimates the observed default rate. The procedure consists of three steps. First, the model residuals are calculated (observed default indicator minus estimated probability). Second, a regression tree is fitted on these residuals to identify segments. Finally, a one-proportion test is applied within each segment to test for overestimation or underestimation of the observed default rate. The results of this validation can indicate the omission of one or more important risk factors, or a specific sub-portfolio for which the model performs worse than for the rest of the portfolio.
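The three steps above can be sketched with rpart and base R on simulated data. This is only an illustration under stated assumptions, not the package's implementation: the data, the variable names (x1, x2, y), the tree control settings and the two-sided form of the test are all chosen here for the example.

```r
library(rpart)

set.seed(1)
n <- 2000
# Hypothetical portfolio: x1 is used in the model, x2 is an omitted risk factor
db <- data.frame(x1 = rnorm(n), x2 = rbinom(n, 1, 0.3))
db$y <- rbinom(n, 1, plogis(-2 + 0.8 * db$x1 + 1.2 * db$x2))

# Model in use (deliberately fitted without x2)
model <- glm(y ~ x1, family = binomial, data = db)

# Step 1: residuals = observed default indicator minus estimated probability
db$pd  <- fitted(model)
db$res <- db$y - db$pd

# Step 2: regression tree on the residuals, using the risk factor left out of the model
# (assumed settings: at least 3% of observations per leaf)
tree <- rpart(res ~ x2, data = db,
              control = rpart.control(minbucket = round(0.03 * n), cp = 0.001))
db$segment <- tree$where  # leaf identifier for every observation

# Step 3: one-proportion test per segment (observed defaults vs. average estimated PD)
seg.test <- do.call(rbind, lapply(split(db, db$segment), function(s) {
  pt <- prop.test(x = sum(s$y), n = nrow(s), p = mean(s$pd))
  data.frame(segment = s$segment[1],
             no      = nrow(s),
             odr     = mean(s$y),   # observed default rate
             pd      = mean(s$pd),  # average estimated probability
             p.value = pt$p.value)
}))
seg.test$significant <- seg.test$p.value < 0.05
seg.test
```

Because x2 drives defaults but is missing from the model, the tree should isolate it as a segment variable, which is the kind of omission the procedure is meant to reveal.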

Usage

segment.vld(model, db, min.leaf = 0.03, alpha = 0.05)

Arguments

model

Model in use, an object of class inheriting from "glm"

db

Modeling data with risk factors and target variable. Risk factors used for model development must be of the same type as in development (if WoE coding was used, they must be numeric with WoE values). Additionally, the remaining risk factors (those that are supplied in db, but not used for model development) will be used for segment validation.

min.leaf

Minimum percentage of observations per leaf, expressed as a proportion (0.03 means 3%). Default is 0.03.

alpha

Significance level for the one-proportion test. Default is 0.05.
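As a rough illustration of how alpha enters the decision for a single segment, the sketch below applies prop.test from base R to made-up numbers. The segment counts, the two-sided test and the wording of the labels are assumptions for this example; the page does not state the exact test settings used by the package.

```r
# Hypothetical segment summary (illustrative numbers, not package output)
n.obs  <- 400   # observations in the segment
n.def  <- 60    # observed defaults in the segment
pd.avg <- 0.10  # average estimated probability of default in the segment
alpha  <- 0.05  # significance level

# One-proportion test: observed default rate against the average estimated PD
test <- prop.test(x = n.def, n = n.obs, p = pd.avg)
odr  <- n.def / n.obs

# Compare the p-value with alpha and label the direction of the deviation
result <- if (test$p.value >= alpha) {
  "no significant difference"
} else if (odr > pd.avg) {
  "model underestimates"
} else {
  "model overestimates"
}
result
```

A smaller alpha makes the test more conservative, so fewer segments are flagged.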

Value

The command segment.vld returns a list of three objects.
The first object (segment.model) contains the regression tree results (rpart object).
The second object (segment.testing) is a data frame with the segment overview and testing results.
The third object (segment.rules) is a data frame with the average residual rate and the rules for segment identification. This element is returned only if segments are identified; otherwise it is NULL.

Examples

suppressMessages(library(PDtoolkit))
library(rpart)
data(loans)
#run stepFWD
res <- stepFWD(start.model = Creditability ~ 1, 
               p.value = 0.05,
               coding = "WoE",
               db = loans)
#check output elements
names(res)
#extract the final model
final.model <- res$model
#print coefficients
summary(final.model)$coefficients
#run segment validation procedure
seg.analysis <- segment.vld(model = final.model, 
                            db = res$dev.db,
                            min.leaf = 0.03,
                            alpha = 0.05)
#check output elements
names(seg.analysis)
#print segment model - regression tree
seg.analysis$segment.model
#print segment summary and statistical testing
seg.analysis$segment.testing
#print segment identification rules
seg.analysis$segment.rules

[Package PDtoolkit version 1.2.0 Index]