| impute_proxy {simputation} | R Documentation |
Impute by variable derivation
Description
Impute missing values by a constant, by copying another variable, or by computing transformations from other variables.
Usage
impute_proxy(dat, formula, add_residual = c("none", "observed", "normal"), ...)
impute_const(dat, formula, add_residual = c("none", "observed", "normal"), ...)
Arguments
dat |
|
formula |
|
add_residual |
|
... |
Currently unused |
Model Specification
Formulas are of the form
IMPUTED_VARIABLES ~ MODEL_SPECIFICATION [ | GROUPING_VARIABLES ]
The left-hand side of the formula object lists the variable or variables to be imputed.
For impute_const, the MODEL_SPECIFICATION is a single
value and GROUPING_VARIABLES are ignored.
For impute_proxy, the MODEL_SPECIFICATION is a variable or
expression in terms of variables in the dataset that must result in either a
single number or a vector of length nrow(dat).
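The requirement on the result of MODEL_SPECIFICATION can be illustrated with a small base R sketch. It is not the package's implementation: fill_from_expr is a hypothetical helper, written on the assumption that the expression is evaluated in the data and that only missing entries of the target variable are overwritten.

```r
# Hypothetical helper (not part of simputation): evaluate an expression in
# the data frame and use the result to fill the NAs of one target variable.
fill_from_expr <- function(dat, target, expr) {
  value <- eval(expr, dat)
  # the expression must yield a single number or one value per row
  stopifnot(length(value) %in% c(1L, nrow(dat)))
  value <- rep_len(value, nrow(dat))
  is_missing <- is.na(dat[[target]])
  dat[[target]][is_missing] <- value[is_missing]
  dat
}

d <- data.frame(x = c(1, NA, 3), y = c(10, 20, 30))
const_filled <- fill_from_expr(d, "x", quote(7))       # single number, recycled
proxy_filled <- fill_from_expr(d, "x", quote(y / 10))  # vector of length nrow(d)
const_filled$x
proxy_filled$x
```

Observed values of x are left untouched in both calls; only the second row changes.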
If grouping variables are specified, the data set is split according to the values of those variables, and model estimation and imputation occur independently for each group.
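What independent imputation per group amounts to can be sketched in base R for the case of a group mean. This is an illustration under simplifying assumptions (one grouping variable g, one numeric variable x, both names invented for the example), not the code the package runs.

```r
# Toy data: g is the grouping variable, x has missing values.
d <- data.frame(
  g = c("a", "a", "a", "b", "b"),
  x = c(1, NA, 3, 10, NA)
)

# The mean is computed separately within each level of g and returned
# with one value per row.
group_mean <- ave(d$x, d$g, FUN = function(v) mean(v, na.rm = TRUE))

# Only the missing entries are replaced by their own group's mean.
is_missing <- is.na(d$x)
d$x[is_missing] <- group_mean[is_missing]
d$x
```

The missing value in group "a" receives 2 and the one in group "b" receives 10, so no information crosses group boundaries.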
Grouping using dplyr::group_by is also supported. If groups are
defined in both the formula and using dplyr::group_by, the data is
grouped by the union of grouping variables. Any missing value in one of the
grouping variables results in an error.
Examples
irisNA <- iris
irisNA[1:3,1] <- irisNA[3:7,2] <- NA
# impute a constant
a <- impute_const(irisNA, Sepal.Width ~ 7)
head(a)
a <- impute_proxy(irisNA, Sepal.Width ~ 7)
head(a)
# copy a value from another variable (where available)
a <- impute_proxy(irisNA, Sepal.Width ~ Sepal.Length)
head(a)
# group mean imputation
a <- impute_proxy(irisNA
, Sepal.Length ~ mean(Sepal.Length,na.rm=TRUE) | Species)
head(a)
# random hot deck imputation
a <- impute_proxy(irisNA, Sepal.Length ~ mean(Sepal.Length, na.rm=TRUE)
, add_residual = "observed")
# ratio imputation (but use impute_lm for that)
a <- impute_proxy(irisNA,
Sepal.Length ~ mean(Sepal.Length,na.rm=TRUE)/mean(Sepal.Width,na.rm=TRUE) * Sepal.Width)
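The reason the mean with add_residual = "observed" is labelled random hot deck imputation above can be sketched in base R. The sketch assumes that "observed" means residuals are sampled with replacement from the observed residuals; that reading is an assumption here, since the argument's description is not given on this page.

```r
set.seed(1)
x <- c(5.1, NA, 4.7, NA, 5.0)

observed  <- x[!is.na(x)]
center    <- mean(observed)
residuals <- observed - center   # residuals of the "mean" model

# Adding a sampled residual back to the mean reproduces one of the
# observed values, which is what random hot deck imputation does.
n_missing <- sum(is.na(x))
x[is.na(x)] <- center + sample(residuals, n_missing, replace = TRUE)
x
```

Every imputed entry therefore equals some donor value from the observed part of x, up to floating point rounding.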