fifa {DALEX} | R Documentation |
FIFA 20 preprocessed data
Description
The fifa dataset is a preprocessed players_20.csv dataset which comes as a part of "FIFA 20 complete player dataset" at Kaggle.
Usage
data(fifa)
Format
A data frame with 5000 rows, 42 columns, and rownames
Details
It contains the 5000 best players by 'overall' rating and 43 variables (42 columns plus the rownames). These are:
short_name (rownames)
nationality of the player (not used in modeling)
overall, potential, value_eur, wage_eur (4 potential target variables)
age, height, weight, attacking skills, defending skills, goalkeeping skills (37 variables)
It is advised to leave only one target variable for modeling.
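A minimal sketch of how a single target might be kept, assuming the column names listed above. The helper keep_one_target is hypothetical and not part of DALEX; a small toy data frame stands in for fifa so the sketch runs without the package installed.

```r
# Hypothetical helper (not part of DALEX): keep one target column and
# drop the remaining potential targets plus nationality.
keep_one_target <- function(df, target,
                            targets = c("overall", "potential",
                                        "value_eur", "wage_eur"),
                            drop = "nationality") {
  stopifnot(target %in% targets, target %in% names(df))
  to_remove <- c(setdiff(targets, target), drop)
  # setdiff() keeps the original column order of df
  df[, setdiff(names(df), to_remove), drop = FALSE]
}

# Toy stand-in for the fifa data frame (values are made up).
toy <- data.frame(
  nationality = factor(c("A", "B")),
  overall     = c(94, 93),
  potential   = c(94, 93),
  value_eur   = c(95500000, 58500000),
  wage_eur    = c(565000, 405000),
  age         = c(32, 34),
  row.names   = c("Player One", "Player Two")
)

toy_value <- keep_one_target(toy, "value_eur")
names(toy_value)
```

With DALEX installed, the same helper could be applied to the real data after data(fifa), for example keep_one_target(fifa, "value_eur").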
Source: https://www.kaggle.com/stefanoleone992/fifa-20-complete-player-dataset
All transformations:
take 43 columns: [3, 5, 7:9, 11:14, 45:78] (R indexing)
take rows with value_eur > 0
convert short_name to ASCII
remove rows with duplicated short_name (keep first)
sort rows on overall and take top 5000
set short_name column as rownames
transform nationality to factor
reorder columns
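The listed steps can be sketched roughly as below. The original preprocessing script is not shown on this page, so this is an illustration under assumptions: the function name preprocess_players is hypothetical, the use of iconv() for the ASCII conversion is a guess (its transliteration output is platform dependent), and the final "reorder columns" step is omitted because the target order is not specified. A toy data frame is used so the sketch runs without the Kaggle file.

```r
# Hypothetical sketch of the listed transformations (not the original script).
preprocess_players <- function(raw,
                               cols = c(3, 5, 7:9, 11:14, 45:78),
                               top_n = 5000) {
  df <- raw[, cols, drop = FALSE]                  # take the selected columns
  df <- df[df$value_eur > 0, , drop = FALSE]       # keep rows with value_eur > 0
  # Assumption: convert names to ASCII via iconv() transliteration.
  df$short_name <- iconv(df$short_name, from = "UTF-8", to = "ASCII//TRANSLIT")
  df <- df[!duplicated(df$short_name), , drop = FALSE]  # keep first occurrence
  df <- df[order(df$overall, decreasing = TRUE), , drop = FALSE]
  df <- head(df, top_n)                            # take the top rows by overall
  rownames(df) <- df$short_name                    # short_name becomes rownames
  df$short_name <- NULL
  df$nationality <- factor(df$nationality)
  df
}

# Toy input (made-up values) with only the columns the steps refer to.
toy_raw <- data.frame(
  short_name  = c("A. One", "B. Two", "A. One", "C. Three"),
  nationality = c("X", "Y", "X", "Y"),
  overall     = c(80, 90, 70, 85),
  value_eur   = c(100, 200, 300, 0),
  stringsAsFactors = FALSE
)

toy <- preprocess_players(toy_raw, cols = 1:4, top_n = 2)
rownames(toy)
```

For the real file, something like preprocess_players(read.csv("players_20.csv")) would use the default column indices, which select 43 columns.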
Source
The players_20.csv dataset was downloaded from the Kaggle site and went through a few transformations.
The complete dataset was obtained from https://www.kaggle.com/stefanoleone992/fifa-20-complete-player-dataset#players_20.csv on January 1, 2020.