validate {validate} | R Documentation |
Data Validation Infrastructure
Description
Data often suffer from errors and missing values. A necessary step before data
analysis is verifying and validating your data. Package validate is a
toolbox for creating validation rules and checking data against these rules.
Getting started
The easiest way to get started is through the examples given in check_that.
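As a quick illustration of that entry point, the following minimal sketch checks two rules in a single call. The choice of the built-in women data set and the two rules are assumptions made for this example only.

```r
library(validate)

# 'women' ships with base R (columns: height, weight).
# The rules below are illustrative, not prescribed by the package.
cf <- check_that(women, height > 0, weight > 0)

# One row per rule, with counts of passes, fails and missing results.
summary(cf)
```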
The general workflow in validate follows this pattern.
- Define a set of rules or quality indicators using validator or indicator.
- confront data with the rules or indicators.
- Examine the results either graphically or by summary.
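The three steps can be sketched as follows. The data frame dat, its column names and the rules are hypothetical and serve only to show the pattern.

```r
library(validate)

# Hypothetical toy data containing a negative and a missing value.
dat <- data.frame(
  age    = c(25, -3, 40, NA),
  income = c(3000, 1200, -50, 2000)
)

# Step 1: define a set of rules.
rules <- validator(
  age >= 0,
  income >= 0
)

# Step 2: confront the data with the rules.
out <- confront(dat, rules)

# Step 3: examine the results by summary or graphically.
res <- summary(out)
print(res)
plot(out)
```

A rule evaluated on a missing value yields NA rather than a pass or a fail, so the summary counts such records separately in the nNA column.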
There are several convenience functions that allow one to define rules from the command line or through a (freeform or YAML) file, and to investigate and maintain the rules themselves. Please have a look at the cookbook for a comprehensive introduction.
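As a rough sketch of the file-based route, the example below writes two rules to a temporary freeform file, reads them back with the .file argument of validator, and then inspects the resulting rule set. The file contents and rule choices are assumptions made for illustration.

```r
library(validate)

# Write a hypothetical freeform rule file: one R expression per line.
rule_file <- tempfile(fileext = ".R")
writeLines(c(
  "# illustrative rules",
  "age >= 0",
  "income >= 0"
), rule_file)

# Read the rules from file instead of typing them at the command line.
rules <- validator(.file = rule_file)

# Investigate the rule set.
n_rules <- length(rules)      # number of rules read
vars    <- variables(rules)   # variables the rules refer to
print(rules)
```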
Author(s)
Maintainer: Mark van der Loo mark.vanderloo@gmail.com (ORCID)
Authors:
Edwin de Jonge (ORCID)
Other contributors:
Paul Hsieh [contributor]
References
An overview of this package, its underlying ideas and many examples can be found in M.P.J. van der Loo and E. de Jonge (2018). Statistical Data Cleaning with Applications in R. John Wiley & Sons.
Please use citation("validate") to get a citation for (scientific) publications.
See Also
Useful links:
Report bugs at https://github.com/data-cleaning/validate/issues