germancredit {fairness}    R Documentation

Modified German credit dataset

Description

germancredit is a credit scoring data set that can be used to study algorithmic (un)fairness. The data were used to predict defaults on consumer loans in the German market. A model predicting default has already been fit, and its predicted probabilities and predicted default status (yes/no) have been appended to the original data as the columns probability and predicted.

Usage

germancredit
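
As a minimal sketch of loading the data, assuming the fairness package is installed, the snippet below checks for the package first and then prints the dimensions and a few of the columns documented on this page. The variable name has_fairness is introduced here for illustration only.

```r
# Check whether the fairness package is available before loading its data.
# `has_fairness` is a helper name used only in this sketch.
has_fairness <- requireNamespace("fairness", quietly = TRUE)

if (has_fairness) {
  # Load the data set into the current environment.
  data("germancredit", package = "fairness")

  # Number of rows and columns.
  print(dim(germancredit))

  # Outcome, protected attribute, and the two appended model columns.
  str(germancredit[, c("BAD", "Female", "probability", "predicted")])
}
```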

Format

A data frame with 1000 rows and 23 variables:

Account_status

factor, status of existing checking account

Duration

numeric, loan duration in months

Credit_history

factor, previous credit history

Purpose

factor, loan purpose

Amount

numeric, credit amount

Savings

factor, savings account/bonds

Employment

factor, duration of present employment

Installment_rate

numeric, installment rate in percentage of disposable income

Guarantors

factor, other debtors / guarantors

Resident_since

factor, duration of present residence

Property

factor, property

Age

numeric, age in years

Other_plans

factor, other installment plans

Housing

factor, housing

Num_credits

numeric, number of existing credits at this bank

Job

factor, job

People_maintenance

numeric, number of people the applicant is liable to provide maintenance for

Phone

factor, telephone

Foreign

factor, foreign worker

BAD

factor, GOOD/BAD for whether a customer has defaulted on a loan. This is the outcome (target) variable in this dataset.

Female

factor, female/male for gender

probability

numeric, predicted probabilities for default, ranges from 0 to 1

predicted

numeric, predicted values for default, 0/1 for no/yes
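
The columns Female and predicted can be combined for a first look at group differences. The sketch below is one possible approach, not a method prescribed by the package: it computes the share of rows predicted to default within each group. The function predicted_rate_by_group and the toy data frame are hypothetical and introduced here for illustration; the toy rows are not taken from germancredit.

```r
# Share of rows with predicted == 1 within each level of a grouping column.
# `df` is any data frame holding a grouping column and a 0/1 prediction column.
# This helper is hypothetical and not part of the fairness package.
predicted_rate_by_group <- function(df, group = "Female", pred = "predicted") {
  tapply(df[[pred]], df[[group]], mean)
}

# Hypothetical toy rows that mimic the two documented columns.
toy <- data.frame(
  Female    = factor(c("female", "female", "male", "male", "male")),
  predicted = c(1, 0, 1, 1, 0)
)

rates <- predicted_rate_by_group(toy)
print(rates)

# With the fairness package installed, the same call can be applied
# to the real data after loading it:
# data("germancredit", package = "fairness")
# predicted_rate_by_group(germancredit)
```

A large gap between the group rates is only a starting point for a fairness analysis, since it does not account for the true outcome recorded in BAD.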

Source

The dataset has undergone modifications (e.g. categorical variables were encoded, a prediction model was fit, and its predicted probabilities and predicted status were appended to the original dataset).


[Package fairness version 1.2.2 Index]