homogeneity {PDtoolkit} | R Documentation |
Testing homogeneity of the PD rating model
Description
homogeneity performs homogeneity testing of a PD model based on the rating grades and a selected segment.
This test is usually applied to the application portfolio, but it can also be applied to the model development sample.
The method requires a sufficiently large number of observations per segment modality within each rating in order
to produce usable results. For segments with fewer than 30 observations, the test is not performed.
If the user selects a numeric variable from the application portfolio as the segment, the variable is grouped into the selected number of
groups (argument segment.num).
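The grouping of a numeric segment can be illustrated with a short base R sketch. The documentation does not say how the groups are formed, so the quantile-based cut below is only an assumption, and the simulated values and the names x, brks and x.grp are hypothetical.

```r
# Hypothetical numeric segment values (for example, a credit amount)
set.seed(1)
x <- runif(200, min = 250, max = 18000)

# Requested number of groups, mirroring the segment.num argument
segment.num <- 4

# Quantile-based cut points (an assumption: the package may bin differently)
brks <- unique(quantile(x, probs = seq(0, 1, length.out = segment.num + 1)))

# Assign each value to a group; include.lowest keeps the minimum in the first group
x.grp <- cut(x, breaks = brks, include.lowest = TRUE)
table(x.grp)
```

Each resulting group then plays the role of one segment modality in the test.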
Usage
homogeneity(app.port, def.ind, rating, segment, segment.num, alpha = 0.05)
Arguments
app.port: Application portfolio (data frame) which contains the default indicator (0/1), the ratings in use and the variable used as a segment.
def.ind: Name of the column that represents the observed default indicator (0/1).
rating: Name of the column that represents the rating grades in use.
segment: Name of the column that represents the testing segments. If it is of numeric type, then it is first grouped into segment.num groups.
segment.num: Number of groups used for numeric variables supplied as a segment. Only applicable if segment is of numeric type.
alpha: Significance level for the p-value of the two-proportion test. Default is 0.05.
Details
The testing procedure is run for each rating separately, comparing the default rate of one segment modality with the default rate of the remaining segment modalities.
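The one-vs-rest comparison described above can be sketched in base R for a single rating grade. This is a minimal illustration, not the package's implementation: it assumes the two-proportion test behaves like stats::prop.test with its default continuity correction, and the toy data and the min.obs name are hypothetical.

```r
# Toy data for one rating grade (hypothetical values):
# modality "A" has 12 defaults in 100 cases, modality "B" has 30 in 200
d <- data.frame(
  def = c(rep(1, 12), rep(0, 88), rep(1, 30), rep(0, 170)),
  seg = c(rep("A", 100), rep("B", 200))
)

# Counts for the analyzed modality ("A") and for the rest
in.seg     <- d$seg == "A"
no.segment <- sum(in.seg)
nb.segment <- sum(d$def[in.seg])
no.rest    <- sum(!in.seg)
nb.rest    <- sum(d$def[!in.seg])

# Skip the test for small segments (30 is the threshold named in the Description)
min.obs <- 30
if (no.segment >= min.obs) {
  # Two-sided two-proportion test (assumed here to be stats::prop.test)
  tst   <- prop.test(x = c(nb.segment, nb.rest),
                     n = c(no.segment, no.rest),
                     alternative = "two.sided")
  p.val <- tst$p.value
} else {
  p.val <- NA_real_
}
p.val
```

Repeating this for every modality within every rating gives one row per rating and modality, which matches the layout of the returned data frame.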
Value
The command homogeneity returns a data frame with the following columns:
segment.var: Variable used as a segment.
rating: Unique values of rating grades from the application portfolio.
segment.mod: Tested segment modality. The default rate of this modality is compared with the default rate of the rest of the modalities within each rating.
no: Number of observations of the analyzed rating.
nb: Number of defaults (bad cases) of the analyzed rating.
no.segment: Number of observations of the analyzed segment modality.
no.rest: Number of observations of the rest of the segment modalities.
nb.segment: Number of defaults of the analyzed segment modality.
nb.rest: Number of defaults of the rest of the segment modalities.
p.val: Two proportion test (two sided) p-value.
alpha: Selected significance level.
res: Accepted hypothesis.
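A short sketch of how the returned columns might be used to flag rating and modality combinations where the test indicates a difference. The data frame below is hand-made with a subset of the documented columns and invented values. It also assumes that a test which was not performed would show up as a missing p-value, which the documentation does not state.

```r
# Hypothetical output with a subset of the documented columns (values are made up)
res.df <- data.frame(
  rating      = c("01", "01", "02"),
  segment.mod = c("A", "B", "A"),
  p.val       = c(0.30, 0.02, NA),
  alpha       = 0.05
)

# Keep rows where the two-sided p-value is below the significance level;
# which() drops rows with a missing p-value instead of returning NA rows
flagged <- res.df[which(res.df$p.val < res.df$alpha), ]
flagged
```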
Examples
suppressMessages(library(PDtoolkit))
data(loans)
#estimate some dummy model
mod.frm <- `Creditability` ~ `Account Balance` + `Duration of Credit (month)` +
           `Age (years)` + `Value Savings/Stocks` +
           `Duration in Current address`
lr.mod <- glm(mod.frm, family = "binomial", data = loans)
summary(lr.mod)$coefficients
#model predictions
loans$pred <- unname(predict(lr.mod, type = "response", newdata = loans))
#scale probabilities
loans$score <- scaled.score(probs = loans$pred, score = 600, odd = 50/1, pdo = 20)
#group scores into ratings
loans$rating <- ndr.bin(x = round(loans$score), y = loans$Creditability, y.type = "bina")[[2]]
#simulate dummy application portfolio (oversample loans data)
set.seed(2211)
app.port <- loans[sample(seq_len(nrow(loans)), 2500, replace = TRUE), ]
#run homogeneity test on ratings based on the Credit Amount segments
homogeneity(app.port = app.port,
            def.ind = "Creditability",
            rating = "rating",
            segment = "Credit Amount",
            segment.num = 4,
            alpha = 0.05)