R: Modified German credit dataset

germancredit {fairness}

R Documentation

Modified German credit dataset

Description

germancredit is a credit scoring dataset that can be used to study algorithmic (un)fairness. The data were used to predict defaults on consumer loans in the German market. A model predicting default has already been fit, and its predicted probabilities and predicted default status (yes/no) have been appended to the original data.

Usage

germancredit

Format

A data frame with 1000 rows and 23 variables:

Account_status: factor, status of existing checking account
Duration: numeric, loan duration in months
Credit_history: factor, previous credit history
Purpose: factor, loan purpose
Amount: numeric, credit amount
Savings: factor, savings account/bonds
Employment: factor, length of present employment
Installment_rate: numeric, installment rate in percentage of disposable income
Guarantors: factor, other debtors / guarantors
Resident_since: factor, length of time at present residence
Property: factor, property
Age: numeric, age in years
Other_plans: factor, other installment plans
Housing: factor, housing
Num_credits: numeric, number of existing credits at this bank
Job: factor, job
People_maintenance: numeric, number of people the customer is liable to provide maintenance for
Phone: factor, telephone
Foreign: factor, foreign worker
BAD: factor, GOOD/BAD indicating whether a customer has defaulted on a loan. This is the outcome (target) variable in this dataset.
Female: factor, female/male for gender
probability: numeric, predicted probability of default, ranging from 0 to 1
predicted: numeric, predicted default status, 0/1 for no/yes
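
The prediction columns listed above can be summarised per group with base R. The following is a minimal sketch, assuming the fairness package is installed so that the data can be loaded. The helper summarise_by_group is hypothetical and is not a function of the package; it only relies on the column names documented in this section.

```r
# Hypothetical helper (not part of the fairness package): per-group summary of
# the stored predictions, using the column names documented in the Format list.
summarise_by_group <- function(df, group = "Female") {
  groups <- split(df, df[[group]])
  data.frame(
    group = names(groups),
    n = vapply(groups, nrow, integer(1)),
    mean_probability = vapply(groups, function(g) mean(g$probability), numeric(1)),
    share_predicted_default = vapply(groups, function(g) mean(g$predicted == 1), numeric(1)),
    row.names = NULL
  )
}

# Run on the real data only when the package is available.
if (requireNamespace("fairness", quietly = TRUE)) {
  data("germancredit", package = "fairness")
  str(germancredit)  # documented as 1000 rows and 23 variables
  print(summarise_by_group(germancredit))
}
```

Passing a different factor column as group (for example "Foreign") gives the same summary for that grouping.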

Source

The dataset has undergone modifications (e.g. categorical variables were encoded, a prediction model was fit, and the predicted probabilities and predicted status were appended to the original dataset).
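
Examples

Because the observed outcome (BAD) and the stored predictions (predicted) are both in the data, group-wise error rates can be compared directly. The sketch below uses base R only and is not the fairness package's own API. The helper error_rates_by_group is hypothetical, and the default positive = "BAD" is an assumption about the factor label taken from the Format section, so check levels(germancredit$BAD) before relying on it.

```r
# Hypothetical helper (not part of the fairness package): false positive and
# false negative rates of the stored predictions within each group.
# `positive` is the level of `outcome` that marks a default; "BAD" is an
# assumption based on the Format section.
error_rates_by_group <- function(df, outcome = "BAD", preds = "predicted",
                                 group = "Female", positive = "BAD") {
  actual <- as.character(df[[outcome]]) == positive  # TRUE = observed default
  flagged <- df[[preds]] == 1                        # TRUE = predicted default
  rows <- lapply(split(seq_len(nrow(df)), df[[group]]), function(idx) {
    a <- actual[idx]
    f <- flagged[idx]
    c(n = length(idx),
      false_positive_rate = sum(f & !a) / sum(!a),
      false_negative_rate = sum(!f & a) / sum(a))
  })
  data.frame(group = names(rows), do.call(rbind, rows), row.names = NULL)
}

# Run on the real data only when the package is available.
if (requireNamespace("fairness", quietly = TRUE)) {
  data("germancredit", package = "fairness")
  print(levels(germancredit$BAD))  # confirm the label to use for `positive`
  print(error_rates_by_group(germancredit))
}
```

Large gaps between groups in either rate are one informal signal of unequal treatment by the stored model; a group with no observed defaults or no observed non-defaults yields NaN for the corresponding rate.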

[Package fairness version 1.2.2 Index]