R: Testing homogeneity of the PD rating model

homogeneity {PDtoolkit}

R Documentation

Testing homogeneity of the PD rating model

Description

homogeneity tests the homogeneity of a PD model across rating grades and a selected segment. The test is usually applied to the application portfolio, but it can also be applied to the model development sample. The method needs a sufficiently large number of observations per segment modality within each rating to produce usable results: for segments with fewer than 30 observations, the test is not performed. If the user selects a numeric variable from the application portfolio as the segment, the variable is grouped into the selected number of groups (argument segment.num).

Usage

homogeneity(app.port, def.ind, rating, segment, segment.num, alpha = 0.05)

Arguments

`app.port`	Application portfolio (data frame) which contains default indicator (0/1), ratings in use and variable used as a segment.
`def.ind`	Name of the column that represents observed default indicator (0/1).
`rating`	Name of the column that represents the rating grades in use.
`segment`	Name of the column that represents the testing segments. If it is of numeric type, then it is first grouped into `segment.num` groups; otherwise it is used as supplied.
`segment.num`	Number of groups used for numeric variables supplied as a segment. Only applicable if `segment` is of numeric type.
`alpha`	Significance level for the two-proportion test. Default is 0.05.
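As a rough illustration of what grouping a numeric segment into `segment.num` groups can look like, the sketch below uses base R only. The quantile-based cut points and the simulated variable `x` are assumptions made for illustration; the documentation does not state which grouping rule the package applies internally.

```r
# Hypothetical numeric segment variable (simulated, not from the loans data).
set.seed(1)
x <- rexp(1000, rate = 1 / 3000)
segment.num <- 4

# Assumption: cut points are taken at equally spaced quantiles.
# The package may use a different grouping rule.
breaks <- quantile(x, probs = seq(0, 1, length.out = segment.num + 1))
groups <- cut(x, breaks = breaks, include.lowest = TRUE)

# Number of observations per resulting segment modality.
table(groups)
```

Each level of `groups` would then play the role of one segment modality in the test.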

Details

The testing procedure is run separately for each rating, comparing the default rate of one segment modality with the default rate of the remaining segment modalities.
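The one-versus-rest comparison described above can be sketched in base R for a single rating grade. This is a minimal illustration, not the package's implementation: the toy data, the helper `one_vs_rest`, and the use of `prop.test` as the two-proportion test are assumptions, and the package's exact test statistic may differ. The output column names mirror those listed under Value, except for the hypothetical logical column `reject`.

```r
# Hypothetical toy data: one rating grade with a two-level segment.
set.seed(1)
toy <- data.frame(
  rating  = "A",
  segment = rep(c("s1", "s2"), times = c(120, 80)),
  default = c(rbinom(120, 1, 0.05), rbinom(80, 1, 0.15))
)

# Hypothetical helper: compare one segment modality with the rest
# of the modalities within a single rating.
one_vs_rest <- function(d, modality, alpha = 0.05, min.obs = 30) {
  in.seg     <- d$segment == modality
  no.segment <- sum(in.seg)
  no.rest    <- sum(!in.seg)
  nb.segment <- sum(d$default[in.seg])
  nb.rest    <- sum(d$default[!in.seg])
  # Skip the test for modalities with fewer than min.obs observations.
  p.val <- NA_real_
  if (no.segment >= min.obs) {
    # Assumption: base R prop.test stands in for the two-proportion test.
    p.val <- prop.test(x = c(nb.segment, nb.rest),
                       n = c(no.segment, no.rest),
                       alternative = "two.sided")$p.value
  }
  data.frame(segment.mod = modality,
             no.segment = no.segment, no.rest = no.rest,
             nb.segment = nb.segment, nb.rest = nb.rest,
             p.val = p.val, alpha = alpha,
             reject = !is.na(p.val) && p.val < alpha)
}

# Run the comparison for every modality of the segment.
out <- do.call(rbind, lapply(unique(toy$segment), one_vs_rest, d = toy))
out
```

For a full portfolio, the same helper would be applied within each rating grade in turn.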

Value

The command homogeneity returns a data frame with the following columns:

segment.var: Variable used as a segment.
rating: Unique values of rating grades from the application portfolio.
segment.mod: Tested segment modality. The default rate of this modality is compared with the default rate of the rest of the modalities within each rating.
no: Number of observations of the analyzed rating.
nb: Number of defaults (bad cases) of the analyzed rating.
no.segment: Number of observations of the analyzed segment modality.
no.rest: Number of observations of the rest of the segment modalities.
nb.segment: Number of defaults of the analyzed segment modality.
nb.rest: Number of defaults of the rest of the segment modalities.
p.val: P-value of the two-sided two-proportion test.
alpha: Selected significance level.
res: Accepted hypothesis.

Examples

suppressMessages(library(PDtoolkit))
data(loans)
# estimate a dummy logistic regression model
mod.frm <- `Creditability` ~ `Account Balance` + `Duration of Credit (month)` +
           `Age (years)` + `Value Savings/Stocks` +
           `Duration in Current address`
lr.mod <- glm(mod.frm, family = "binomial", data = loans)
summary(lr.mod)$coefficients
# model predictions (predicted probabilities)
loans$pred <- unname(predict(lr.mod, type = "response", newdata = loans))
# scale probabilities into scores
loans$score <- scaled.score(probs = loans$pred, score = 600, odd = 50/1, pdo = 20)
# group scores into ratings
loans$rating <- ndr.bin(x = round(loans$score), y = loans$Creditability, y.type = "bina")[[2]]
# simulate a dummy application portfolio (oversample the loans data with replacement)
set.seed(2211)
app.port <- loans[sample(seq_len(nrow(loans)), 2500, replace = TRUE), ]
# run the homogeneity test on ratings, using Credit Amount (numeric, 4 groups) as the segment
homogeneity(app.port = app.port,
            def.ind = "Creditability",
            rating = "rating",
            segment = "Credit Amount",
            segment.num = 4,
            alpha = 0.05)

[Package PDtoolkit version 1.2.0 Index]