prop_usable_cases {missForestPredict} | R Documentation |
Calculates variable-wise proportion of usable cases (missing and observed)
Description
Calculates variable-wise proportion of usable cases (missing and observed) as in Molenberghs et al. (2014).
Usage
prop_usable_cases(data)
Arguments
data |
dataframe to be imputed |
Details
missForest builds a random forest model for each variable, using the observed values of that variable as the outcome. It then imputes the missing part of the variable using the learned model.
If all values of a predictor are missing among the observed values of the outcome, p_obs will be 1 and the model built will rely heavily on the initialized values. If all values of a predictor are observed among the observed values of the outcome, p_obs will be 0 and the model will rely on observed values. Low values of p_obs are preferred.
Similarly, if all values of a predictor are missing among the missing values of the outcome, p_miss will be 0 and the imputations (predictions) will rely heavily on the initialized values. If all values of a predictor are observed among the missing values of the outcome, p_miss will be 1 and the imputations (predictions) will rely on real values. High values of p_miss are preferred.
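The two quantities described above can be written compactly. This is a sketch of the definitions implied by the text, not a formula quoted from the package or from Molenberghs et al.; the indicator r_ij and the indices i, j, k are notation introduced here for illustration.

```latex
% r_{ij} = 1 if variable j is observed in case i, and 0 if it is missing.
% j indexes the variable to be imputed (the outcome), k the predictor, n the number of cases.
\[
p^{\mathrm{obs}}_{jk} = \frac{\sum_{i=1}^{n} r_{ij}\,(1 - r_{ik})}{\sum_{i=1}^{n} r_{ij}},
\qquad
p^{\mathrm{miss}}_{jk} = \frac{\sum_{i=1}^{n} (1 - r_{ij})\, r_{ik}}{\sum_{i=1}^{n} (1 - r_{ij})}
\]
```

Under these definitions p_miss is undefined (0/0) for a variable with no missing values, which is harmless because such a variable needs no imputation.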
In both p_obs and p_miss, each row represents a variable to be imputed and each column a predictor.
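To make the definitions concrete, the following base R sketch computes both proportions by hand on a small made-up data frame. It is an illustration of the description above, not the package's source code; the data frame `toy` and the helper names (`n_oo`, `n_om`, `n_mo`, `n_mm`) are hypothetical, and the handling of the diagonal and of fully observed variables may differ from what prop_usable_cases returns.

```r
# Toy data frame with missing values (made up for illustration).
toy <- data.frame(
  x = c(1, NA, 3, 4, NA),
  y = c(NA, 2, 3, NA, 5),
  z = c(1, 2, 3, 4, 5)
)

# Numeric indicator matrices: 1 where a value is missing / observed.
mis <- is.na(as.matrix(toy)) * 1
obs <- 1 - mis

# Pairwise counts; entry [j, k] refers to outcome j (row) and predictor k (column).
n_oo <- t(obs) %*% obs  # j observed, k observed
n_om <- t(obs) %*% mis  # j observed, k missing
n_mo <- t(mis) %*% obs  # j missing,  k observed
n_mm <- t(mis) %*% mis  # j missing,  k missing

# Proportion of missing predictor values among observed outcome values.
p_obs <- n_om / (n_om + n_oo)
# Proportion of observed predictor values among missing outcome values.
# Rows for fully observed variables (here z) are NaN because of 0/0.
p_miss <- n_mo / (n_mo + n_mm)

print(round(p_obs, 2))
print(round(p_miss, 2))
```

In this toy example, p_obs["x", "y"] is 2/3 because y is missing in two of the three rows where x is observed, and p_miss["x", "y"] is 1 because y is observed in both rows where x is missing.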
Value
a list with two elements: p_obs and p_miss
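p_obs |
the proportion of missing values of each predictor among the observed values of the variable to be imputed |
p_miss |
the proportion of observed values of each predictor among the missing values of the variable to be imputed |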
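A minimal usage sketch, assuming the missForestPredict package is installed: it calls prop_usable_cases with the signature shown under Usage and prints the two elements of the returned list. The data frame `iris_mis` and the positions of the missing values are made up for illustration, and the exact form of the printed output is not shown because it is not documented here.

```r
# Numeric columns of the built-in iris data, with a few values set to
# missing by hand (positions are arbitrary, chosen only for illustration).
iris_mis <- iris[, 1:4]
iris_mis[c(1, 5, 9), "Sepal.Length"] <- NA
iris_mis[c(5, 20, 40), "Petal.Width"] <- NA

# Only run the call if the package is available.
if (requireNamespace("missForestPredict", quietly = TRUE)) {
  usable <- missForestPredict::prop_usable_cases(iris_mis)
  print(names(usable))   # element names of the returned list
  print(usable$p_obs)    # low values are preferred
  print(usable$p_miss)   # high values are preferred
}
```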
References
Molenberghs, G., Fitzmaurice, G., Kenward, M. G., Tsiatis, A., & Verbeke, G. (Eds.). (2014). Handbook of missing data methodology. CRC Press. Chapter "Multiple Imputation"