editrules_package {editrules}    R Documentation

An overview of the functions of package editrules

Description

Please note: active development has moved to packages 'validate' and 'errorlocate'. Facilitates reading and manipulating (multivariate) data restrictions (edit rules) on numerical and categorical data. Rules can be defined with common R syntax and parsed to an internal (matrix-like) format. Rules can be manipulated with variable elimination and value substitution methods, allowing for feasibility checks and more. Data can be tested against the rules and erroneous fields can be found based on Fellegi and Holt's generalized principle. Rule dependencies can be visualized using the 'igraph' package.

NOTE

This package is no longer under active development. The package is superseded by R packages validate for data validation and errorlocate for error localization. We urge new users to use those packages instead.

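The editrules package aims to provide an environment to conveniently define, read and check recordwise data constraints, including linear constraints on numerical data, value combination constraints on categorical data, and conditional constraints.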

In the literature these constraints, or restrictions, are referred to as “edits”. editrules can perform common rule set manipulations such as variable elimination and value substitution, and offers error localization functionality based on the (generalized) paradigm of Fellegi and Holt. Under this paradigm, one determines the smallest (weighted) number of variables to adapt such that no (additional or derived) rules are violated. The paradigm assumes that errors are distributed randomly over the variables and that there is no detectable cause of error. It also decouples the detection of corrupt variables from their correction. For some types of error, such as sign flips, typing errors or rounding errors, this assumption does not hold: such errors can be detected directly, and detecting them is closely tied to resolving them. The reader is referred to the deducorrect package for treating such errors.

I. Define edits

editrules provides several methods for creating edits from a character vector, expression, data.frame or text file.

editfile      Read conditional numerical, numerical and categorical constraints from a text file
editset       Create conditional numerical, numerical and categorical constraints
editmatrix    Create a linear constraint matrix for numerical data
editarray     Create value combination constraints for categorical data
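
As a minimal sketch of the constructors listed above, the following defines one rule set of each kind from character vectors. The variable names (turnover, profit, costs, gender, pregnant) are made up for illustration and are not part of the package.

```r
library(editrules)

# Linear rules on numerical variables (illustrative variable names).
E_num <- editmatrix(c(
  "turnover == profit + costs",
  "costs >= 0"
))

# Value combination rules on categorical variables.
# The %in% statements declare the allowed categories of each variable.
E_cat <- editarray(c(
  "gender %in% c('male', 'female')",
  "pregnant %in% c('yes', 'no')",
  "if (gender == 'male') pregnant == 'no'"
))

# A rule set that combines a plain numerical rule with a conditional one.
E_mix <- editset(c(
  "costs >= 0",
  "if (costs > 0) turnover > 0"
))

print(E_num)
print(E_cat)
print(E_mix)
```

editfile accepts the same kind of rules, stored one per line in a text file.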

II. Check and find errors in data

editrules provides several methods for checking data.frames against edits.

violatedEdits     Find out which record violates which edit
localizeErrors    Localize erroneous fields using Fellegi and Holt's principle
errorLocalizer    Low-level error localization function using a branch-and-bound (B&B) algorithm

Note that you can call plot, summary and print on results of these functions.
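
The following sketch shows how checking and error localization might be combined on a small data set. The rules, the data and the weights are invented for illustration. The second record breaks the balance rule, and the weights are chosen so that costs is the cheapest variable to adapt.

```r
library(editrules)

# Illustrative rules: a balance equation and a sign restriction.
E <- editmatrix(c(
  "turnover == profit + costs",
  "costs >= 0"
))

# Two records; the second one does not satisfy the balance equation.
dat <- data.frame(
  turnover = c(100, 100),
  profit   = c(30, 30),
  costs    = c(70, 80)
)

# Logical matrix: one row per record, one column per edit.
v <- violatedEdits(E, dat)
print(v)

# Fellegi-Holt localization. A higher weight makes a variable
# more expensive to change; weights follow the column order of dat.
loc <- localizeErrors(E, dat, weight = c(3, 2, 1))

# Logical matrix: TRUE marks a field that should be adapted.
print(loc$adapt)
```

localizeErrors only indicates which fields to adapt; it does not compute corrected values.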

III. Manipulate and check edits

editrules provides several methods for manipulating edits.

substValue       Substitute a value in a set of rules
eliminate        Derive implied rules by variable elimination
reduce           Remove unconstrained variables
isFeasible       Check for contradictions
duplicated       Find duplicated rules
blocks           Decompose rules into independent blocks
disjunct         Decouple conditional edits into disjunct edit sets
separate         Decompose rules into blocks and decouple conditional edits
generateEdits    Generate all nonredundant implicit edits (editarray only)

IV. Plot and coerce edits

editrules provides several methods for plotting and coercion.

editrules.plotting    Plot edit-variable connectivity graph
as.igraph             Coerce to edit-variable connectivity igraph object
as.character          Coerce edits to character representation
as.data.frame         Store character representation in data.frame
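
A few of the manipulation and coercion functions listed in the two sections above can be tried out as follows. This is only a sketch: the rules and variable names are illustrative, and E_bad is a deliberately contradictory rule set constructed for the example.

```r
library(editrules)

# Illustrative rule set on three numerical variables.
E <- editmatrix(c(
  "turnover == profit + costs",
  "costs >= 0",
  "profit >= 0"
))

# Feasibility: does at least one record exist that satisfies all rules?
feasible <- isFeasible(E)

# A contradictory rule set: x cannot be both >= 1 and <= 0.
E_bad <- editmatrix(c("x >= 1", "x <= 0"))
feasible_bad <- isFeasible(E_bad)

# Fix costs at a known value and see which rules remain.
E_subst <- substValue(E, "costs", 70)
print(E_subst)

# Eliminate profit to derive the rules implied for the other variables.
E_elim <- eliminate(E, "profit")
print(E_elim)

# Coerce the rules back to text, or to a data.frame for storage.
rules_chr <- as.character(E)
rules_df  <- as.data.frame(E)
print(rules_chr)
print(rules_df)
```

Storing rules with as.data.frame is convenient when they are kept next to the data they describe.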

[Package editrules version 2.9.5 Index]