germancredit {fairness} | R Documentation
Modified German credit dataset
Description
germancredit is a credit scoring data set that can be used to study algorithmic (un)fairness.
The data were used to predict defaults on consumer loans in the German market. A model
to predict default has already been fit to this data set, and its predicted probabilities and
predicted default status (yes/no) have been appended to the original data.
Usage
germancredit
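A minimal sketch of loading the data into a local object, assuming the fairness package is installed. The helper name `load_germancredit` is hypothetical and is not part of the package; it only wraps the base R `data()` call.

```r
# Hypothetical helper (not part of the fairness package): load the data set
# into a fresh environment and return it as a data frame.
load_germancredit <- function() {
  if (!requireNamespace("fairness", quietly = TRUE)) {
    stop("Package 'fairness' is required; install it with install.packages('fairness').")
  }
  env <- new.env()
  utils::data("germancredit", package = "fairness", envir = env)
  env$germancredit
}

# Usage (requires the fairness package to be installed):
# credit <- load_germancredit()
# dim(credit)
# str(credit[, c("BAD", "Female", "probability", "predicted")])
```

Loading into a separate environment keeps the global workspace untouched; calling `data("germancredit", package = "fairness")` directly works as well.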
Format
A data frame with 1000 rows and 23 variables:
- Account_status
factor, status of existing checking account
- Duration
numeric, loan duration in months
- Credit_history
factor, previous credit history
- Purpose
factor, loan purpose
- Amount
numeric, credit amount
- Savings
factor, savings account/bonds
- Employment
factor, present employment since
- Installment_rate
numeric, installment rate in percentage of disposable income
- Guarantors
factor, other debtors / guarantors
- Resident_since
factor, present residence since
- Property
factor, property
- Age
numeric, age in years
- Other_plans
factor, other installment plans
- Housing
factor, housing
- Num_credits
numeric, number of existing credits at this bank
- Job
factor, job
- People_maintenance
numeric, number of people the applicant is liable to provide maintenance for
- Phone
factor, telephone
- Foreign
factor, foreign worker
- BAD
factor, GOOD/BAD, indicating whether a customer has defaulted on a loan. This is the outcome (target) variable in this dataset.
- Female
factor, female/male for gender
- probability
numeric, predicted probabilities for default, ranges from 0 to 1
- predicted
numeric, predicted values for default, 0/1 for no/yes
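Because the predictions are stored alongside the Female column, a first look at group differences needs only base R. The following is a rough sketch, not a method provided by the package: `summarise_by_group` is a hypothetical helper, and the toy data frame reuses the column names from the list above with made-up values so that the block runs without the package installed.

```r
# Hypothetical helper: per level of a grouping column, report the group size,
# the mean predicted probability, and the share of rows with predicted == 1.
summarise_by_group <- function(df, group = "Female") {
  stopifnot(all(c(group, "probability", "predicted") %in% names(df)))
  groups <- split(df, df[[group]], drop = TRUE)
  data.frame(
    group = names(groups),
    n = vapply(groups, nrow, integer(1)),
    mean_probability = vapply(groups, function(g) mean(g$probability), numeric(1)),
    predicted_default_rate = vapply(groups, function(g) mean(g$predicted == 1), numeric(1)),
    row.names = NULL
  )
}

# Toy data with the same column names; the values are invented for
# illustration and are not taken from germancredit.
toy <- data.frame(
  Female = factor(c("Female", "Female", "Male", "Male")),
  probability = c(0.2, 0.8, 0.4, 0.6),
  predicted = c(0, 1, 0, 0)
)
summarise_by_group(toy)
```

With the real data loaded, `summarise_by_group(germancredit)` would give the same summary per gender. A large gap in the predicted default rate between groups is only a starting point for a fairness analysis, not a conclusion.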
Source
The dataset has undergone modifications: for example, categorical variables were encoded, a prediction model was fit, and the predicted probabilities and predicted status were appended to the original dataset.