| impute_unsupervised {imputeGeneric} | R Documentation | 
Unsupervised imputation
Description
Impute a data set with an unsupervised inner method. This is one of the
main functions that can be used inside impute_iterative(). If you need
pre-imputation or iterations, use impute_iterative() directly.
Usage
impute_unsupervised(
  ds,
  model_fun,
  predict_fun,
  rows_used_for_imputation = "only_complete",
  rows_order = seq_len(nrow(ds)),
  update_model = "every_iteration",
  update_ds_model = "every_iteration",
  model_arg = NULL,
  M = is.na(ds),
  ...
)
Arguments
| ds | The data set to be imputed. Must be a data frame with column names. | 
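| model_fun | An unsupervised model function which takes as arguments a data set, the missing data indicator matrix M, the index i of the row to be imputed, and model_arg, in this order (see Details). | 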
| predict_fun | A predict function which uses the model generated via model_fun to impute the missing values in a row (see Details). | 
| rows_used_for_imputation | Which rows should be used to impute other rows? Possible choices: "only_complete", "already_imputed", "all_except_i", "all" | 
| rows_order | Ordering of the rows for imputation. This can be a vector
with indices or an  | 
| update_model | How often should the model for imputation be updated? Possible choices are: "everytime" (after every imputed value) and "every_iteration" (only one model is created and used for all missing values). | 
| update_ds_model | How often should the data set for the inner model be updated? Possible choices are: "everytime" (after every imputed value), and "every_iteration". | 
| model_arg | Further arguments for model_fun. | 
| M | Missing data indicator matrix | 
| ... | Further arguments given to  | 
Details
This function imputes the data set ds row by row. The order in which the
rows are imputed can be specified by rows_order. Furthermore,
rows_used_for_imputation controls which rows are used for the imputation.
If ds is pre-imputed, the missing data indicator matrix can be supplied
via M.
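The role of M for a pre-imputed data set can be illustrated with a minimal base R sketch. The toy data and the mean-based pre-imputation below are assumptions chosen for illustration and are not part of this package. The point is only that M must be recorded before pre-imputation, because is.na() on the pre-imputed data no longer finds the originally missing cells.

```r
# Toy data with two missing values (values chosen only for illustration)
ds <- data.frame(X = c(1, NA, 3, 4), Y = c(10, 20, NA, 40))

# Record where values are missing *before* any pre-imputation
M <- is.na(ds)

# Hypothetical pre-imputation: fill each column with its observed mean
ds_pre <- ds
for (j in seq_along(ds_pre)) {
  ds_pre[M[, j], j] <- mean(ds[[j]], na.rm = TRUE)
}

# ds_pre contains no NA any more, so only M still marks the cells to impute
anyNA(ds_pre)
which(M, arr.ind = TRUE)
```

A matrix saved this way is what would be handed over through the M argument together with the pre-imputed data set.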
The inner method used to impute the data set is defined by model_fun.
This model_fun must take, in this order, a data set, the missing data
indicator matrix M, the index i of the row currently being imputed (which
is NULL if the model is updated only once per iteration or only uses
complete rows) and model_arg. It must return a model model_imp, which is
passed to predict_fun to generate imputation values for the missing values
in row i. Both model_fun and predict_fun can be self-written, or a
predefined pair can be used (see See Also).
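A self-written pair of functions might look like the following sketch. The argument order of the model function follows the description above. The signature of the predict function (ds, M, i, model_imp, ...) and its return value (one imputation value per missing entry of row i) are assumptions, as are the names model_col_means and predict_col_means. The loop at the end is a stand-alone illustration and not the package's implementation.

```r
# Model function: takes ds, M, i and model_arg in this order and returns a
# model object; here simply the column means of the observed values.
model_col_means <- function(ds, M = is.na(ds), i = NULL, model_arg = NULL) {
  ds_obs <- as.matrix(ds)
  ds_obs[M] <- NA  # ignore cells flagged as missing (or pre-imputed)
  colMeans(ds_obs, na.rm = TRUE)
}

# Predict function with an ASSUMED signature: returns one imputation value
# for every missing entry in row i, taken from the fitted column means.
predict_col_means <- function(ds, M, i, model_imp, ...) {
  unname(model_imp[M[i, ]])
}

# Toy data (values chosen only for illustration)
ds <- data.frame(X = c(1, NA, 3, 4), Y = c(10, 20, NA, 40))
M <- is.na(ds)

# Stand-alone loop: fit one model, then impute every incomplete row with it
model_imp <- model_col_means(ds, M)
ds_imp <- ds
for (i in which(rowSums(M) > 0)) {
  ds_imp[i, M[i, ]] <- predict_col_means(ds, M, i, model_imp)
}
ds_imp
```

Before passing such a pair as model_fun and predict_fun, compare the assumed predict interface with that of predict_donor().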
If update_model = "every_iteration", only one model is fitted and the
argument update_ds_model is ignored. This option can be considerably
faster than update_model = "everytime", especially for data sets with
many rows that contain missing values. However, some methods (such as
nearest neighbors) need update_model = "everytime".
Value
The imputed data set.
See Also
model_donor() and predict_donor() for a pair of predefined
functions for model_fun and predict_fun.
Examples
ds_mis <- missMethods::delete_MCAR(
  data.frame(X = rnorm(20), Y = rnorm(20)), 0.2, 1
)
impute_unsupervised(ds_mis, model_donor, predict_donor)
# knn imputation with k = 2
impute_unsupervised(ds_mis, model_donor, predict_donor,
  update_model = "everytime", model_arg = list(k = 2)
)