toyData {dataMaid}    R Documentation
Small example data to show the features of dataMaid
Description
An artificial dataset intended for presenting the key features of dataMaid, which is a toolset for identifying potential errors in a dataset.
Usage
toyData
Format
A data.frame with 15 rows and 6 variables.
- pill
A factor variable with two levels ("red" and "blue") and a few (correctly coded) missing observations. This represents the colour of a pill.
- events
A numeric variable with one obvious outlier value (82), two miscoded missing values (999 and NaN) and a few correctly coded missing values. The number of previous events.
- region
A factor variable where two of the levels ("other" and "OTHER") are the same word with different case settings. Moreover, the variable includes a Stata-style miscoded missing value ("."). Used to represent geographical regions or treatment centers.
- change
A numeric variable (random draws from a standard normal distribution). Representing a change in a measured variable.
- id
A factor variable with unique codes for each observation (a character string with a number between 1 and 15), i.e. a key variable.
- spotifysong
A factor variable that has the same level ("Irrelevant") for all observations, i.e. an empty variable. The latest song played on Spotify.
Source
Artificial data
References
Petersen AH, Ekstrøm CT (2019). “dataMaid: Your Assistant for Documenting Supervised Data Quality Screening in R.” _Journal of Statistical Software_, *90*(6), 1-38. doi: 10.18637/jss.v090.i06.
Examples
data(toyData)
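
The problems described under Format can also be looked for by hand. The following is a minimal sketch in base R, not dataMaid's own checks: the helper names case_duplicates, is_constant and is_candidate_key are hypothetical and are introduced here only for illustration. The part that touches toyData runs only if dataMaid is installed.

```r
# Hypothetical base-R helpers (not part of dataMaid) that look for the
# kinds of problems described under Format.

# Values that differ only by letter case, e.g. "other" vs "OTHER".
case_duplicates <- function(x) {
  lv <- unique(as.character(x[!is.na(x)]))
  lower <- tolower(lv)
  lv[lower %in% lower[duplicated(lower)]]
}

# TRUE if all non-missing observations share a single value.
is_constant <- function(x) {
  length(unique(x[!is.na(x)])) <= 1
}

# TRUE if no observation is missing and no value is repeated,
# i.e. the variable could serve as a key.
is_candidate_key <- function(x) {
  !anyNA(x) && anyDuplicated(x) == 0
}

# Apply the helpers to toyData only when dataMaid is available.
if (requireNamespace("dataMaid", quietly = TRUE)) {
  data(toyData, package = "dataMaid")
  str(toyData)

  print(case_duplicates(toyData$region))      # case-variant levels
  print(is_constant(toyData$spotifysong))     # single-level variable
  print(is_candidate_key(toyData$id))         # key variable
  print(sum(is.nan(toyData$events)))          # NaN used as missing
  print(sum(toyData$events %in% 999))         # 999 used as missing
}
```

These helpers only flag candidates. Whether a flagged value such as 999 is really a miscoded missing value still has to be judged by someone who knows the data.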