fifa {DALEX}    R Documentation

FIFA 20 preprocessed data

Description

The fifa dataset is a preprocessed version of players_20.csv, which is part of the "FIFA 20 complete player dataset" on Kaggle.

Usage

data(fifa)

Format

A data frame with 5000 rows and 42 columns; player names (short_name) are stored as row names.

Details

It contains the 5000 players with the highest 'overall' rating and 43 variables: 42 columns plus short_name, which is stored as row names.

It is advised to leave only one target variable for modeling.
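
As an illustration of keeping a single target, here is a minimal sketch on a made-up miniature data frame. The set of candidate target columns (overall, potential, value_eur, wage_eur) is an assumption and should be checked against names(fifa); the object fifa_toy and its values are hypothetical.

```r
# Hypothetical miniature of the fifa data frame (all values are made up).
fifa_toy <- data.frame(
  age       = c(30, 25, 22),
  overall   = c(90, 85, 80),
  potential = c(90, 88, 86),
  value_eur = c(3e6, 2e6, 1e6),
  wage_eur  = c(3000, 2000, 1000),
  row.names = c("Player A", "Player B", "Player C")
)

# Assumed candidate targets; verify against names(fifa) before relying on this.
candidate_targets <- c("overall", "potential", "value_eur", "wage_eur")
target <- "value_eur"

# Drop every candidate target except the chosen one.
drop_cols  <- setdiff(candidate_targets, target)
fifa_model <- fifa_toy[, !(names(fifa_toy) %in% drop_cols), drop = FALSE]

names(fifa_model)
```

The same subsetting can be applied to the real data after data(fifa), replacing fifa_toy with fifa.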

Source: https://www.kaggle.com/stefanoleone992/fifa-20-complete-player-dataset

All transformations:

  1. take 43 columns: [3, 5, 7:9, 11:14, 45:78] (R indexing)

  2. take rows with value_eur > 0

  3. convert short_name to ASCII

  4. remove rows with duplicated short_name (keep first)

  5. sort rows by overall (descending) and take the top 5000

  6. set short_name column as rownames

  7. transform nationality to factor

  8. reorder columns
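
Steps 2 to 7 above can be sketched in base R on a small made-up data frame. This is not the package authors' script: the players object, its values, and top_n are hypothetical, and the use of iconv() with sub = "" (which drops non-ASCII characters) is an assumed way to do the ASCII conversion. Steps 1 and 8 are omitted because they depend on the column layout of the real file and on a column order that is not specified here.

```r
# Made-up stand-in for players_20.csv (hypothetical players and values).
players <- data.frame(
  short_name  = c("A. Alpha", "B. B\u00e9ta", "A. Alpha",
                  "C. Gamma", "D. Delta", "E. Epsilon"),
  nationality = c("Poland", "Spain", "Poland", "Spain", "Poland", "Spain"),
  overall     = c(94, 92, 60, 70, 65, 80),
  value_eur   = c(9e6, 8e6, 1e5, 0, 5e5, 2e6),
  stringsAsFactors = FALSE
)

top_n <- 3  # the documented pipeline keeps 5000

# 2. keep rows with value_eur > 0
players <- players[players$value_eur > 0, ]

# 3. convert short_name to ASCII (assumption: non-ASCII characters are dropped)
players$short_name <- iconv(players$short_name, from = "UTF-8",
                            to = "ASCII", sub = "")

# 4. remove rows with duplicated short_name, keeping the first occurrence
players <- players[!duplicated(players$short_name), ]

# 5. sort by overall (descending) and keep the top rows
players <- players[order(players$overall, decreasing = TRUE), ]
players <- head(players, top_n)

# 6. use short_name as row names and drop the column
rownames(players) <- players$short_name
players$short_name <- NULL

# 7. turn nationality into a factor
players$nationality <- factor(players$nationality)

str(players)
```

Deduplicating before setting row names matters, since row names of a data frame must be unique.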

Source

The players_20.csv dataset was downloaded from Kaggle and went through a few transformations. The complete dataset was obtained from https://www.kaggle.com/stefanoleone992/fifa-20-complete-player-dataset#players_20.csv on January 1, 2020.


[Package DALEX version 2.4.3 Index]