overimpute {micemd} | R Documentation |
Overimputation diagnostic plot
Description
Assess the fit of the predictive distribution after performing multiple imputation with mice
Usage
overimpute(res.mice, plotvars = NULL, plotinds = NULL,
nnodes = 5, path.outfile = NULL, alpha = 0.1)
Arguments
res.mice |
An object of class mids |
plotvars |
column index of the variables overimputed |
plotinds |
row index of the individuals overimputed |
nnodes |
A scalar indicating the number of nodes for parallel calculation. Default value is 5. |
path.outfile |
A vector of strings indicating the path for redirection of print messages. Default value is NULL, meaning that silent imputation is performed. Otherwise, print messages are saved in the files path.outfile/output.txt. One file per node is generated. |
alpha |
alpha level for prediction intervals |
Details
This function imputes each observed values from each of the parameters of the imputation model obtained from the mice procedure. The comparison between the "overimputed" values and the observed values is made by building a confidence interval for each observed value using the quantiles of the overimputed values (Blackwell et al. (2015)). Note that confidence intervals builded with quantiles require a large number of imputations. If the model fits the data well, then the 90% confidence interval should contain the observed value in 90% of the cases (the proportion of intervals containing the observed value is reported in the title of each graph). The function overimpute takes as an input the output of the mice or mice.par function (res.mice), the indices of the incomplete continuous variables that are plotted (plotvars), the indices of individuals (can be useful for time consumming imputation methods), the number of nodes for parallel computation, and the path for exporting print message generated during the parallel process.
Value
A list of two matrices
res.plot |
7-columns matrix that contains (1) the variable which is overimputed, (2) the observed value of the observation, (3) the mean of the overimputations, (4) the lower bound of the confidence interval of the overimputations, (5) the upper bound of the confidence interval of the overimputations, (6) the proportion of the other variables that were missing for that observation in the original data, and (7) the color for graphical representation. |
res.values |
A matrix with overimputed values for each cell. The number of columns corresponds to the number of values generated (i.e. the number of imputed tables) |
Author(s)
Vincent Audigier vincent.audigier@cnam.fr
References
Blackwell, M., Honaker, J. and King. G. 2015. A Unified Approach to Measurement Error and Missing Data: Overview and Applications. Sociological Methods and Research, 1-39. <doi:10.1177/0049124115585360>
See Also
Examples
require(parallel)
nnodes<-detectCores()-1#number of nodes
m<-1000#nb generated values per observation
################
#one level data
################
require(mice)
data(nhanes)
#res.mice<-mice.par(nhanes,m = m,nnodes = nnodes)
#res.over<-overimpute(res.mice, nnodes = nnodes)
################
#two level data (time consumming)
################
data(CHEM97Na)
ind.clust<-1#index for the cluster variable
#initialisation of the argument predictorMatrix
predictor.matrix<-mice(CHEM97Na,m=1,maxit=0)$pred
predictor.matrix[ind.clust,ind.clust]<-0
predictor.matrix[-ind.clust,ind.clust]<- -2
predictor.matrix[predictor.matrix==1]<-2
#initialisation of the argument method
method<-find.defaultMethod(CHEM97Na,ind.clust)
#multiple imputation by chained equations (time consumming)
#res.mice<-mice.par(CHEM97Na,
# predictorMatrix = predictor.matrix,
# method=method,m=m,nnodes = nnodes)
#overimputation on 30 individuals
#res.over<-overimpute(res.mice,
# nnodes=nnodes,
# plotinds=sample(x = seq(nrow(CHEM97Na)),size = 30))