| JointAI {JointAI} | R Documentation |
JointAI: Joint Analysis and Imputation of Incomplete Data
Description
The JointAI package performs simultaneous imputation and inference for incomplete or complete data under the Bayesian framework. Models for incomplete covariates, conditional on other covariates, are specified automatically and modelled jointly with the analysis model. MCMC sampling is performed in 'JAGS' via the R package rjags.
Main functions
JointAI provides the following main functions that facilitate analysis with different models:
- lm_imp for linear regression
- glm_imp for generalized linear regression
- betareg_imp for regression using a beta distribution
- lognorm_imp for regression using a log-normal distribution
- clm_imp for (ordinal) cumulative logit models
- mlogit_imp for multinomial models
- betamm_imp for mixed models using a beta distribution
- lognormmm_imp for mixed models using a log-normal distribution
- clmm_imp for (ordinal) cumulative logit mixed models
- survreg_imp for parametric (Weibull) survival models
- coxph_imp for (Cox) proportional hazard models
- JM_imp for joint models of longitudinal and survival data
As far as possible, the specification of these functions is analogous to the specification of widely used functions for the analysis of complete data, such as lm, glm, lme (from the package nlme), survreg (from the package survival) and coxph (from the package survival).
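To illustrate how closely the interface follows its complete-data counterparts, here is a minimal sketch using lm_imp next to lm. It assumes that JAGS and the JointAI package are installed and uses the NHANES example data shipped with JointAI; the choice of covariates, the number of iterations and the seed are illustrative assumptions, not recommendations.

```r
library(JointAI)

# NHANES is an example data set shipped with JointAI that contains
# missing values in several covariates.
fit <- lm_imp(SBP ~ gender + age + WC + alc,
              data = NHANES,
              n.iter = 500,           # MCMC iterations after adaptation (illustrative)
              seed = 2020,            # for reproducibility
              progress.bar = "none")  # suppress the rjags progress bar

# The complete-data call has the same shape, but by default
# drops all rows with missing values.
fit_cc <- lm(SBP ~ gender + age + WC + alc, data = NHANES)
```

Only the function name and the MCMC-related arguments differ between the two calls; the formula and data arguments are specified in the same way.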
Computations can be performed in parallel, using the package future, to reduce computation time. The argument shrinkage allows the user to impose a penalty on the regression coefficients of some or all models involved, and hyper-parameters can be changed via the argument hyperpars.
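The following sketch combines the three options in one call. It assumes that JAGS, JointAI and future are installed. The number of workers, the ridge penalty and the modified hyper-parameter are illustrative choices, and the element name shape_tau_norm is an assumption about the structure returned by default_hyperpars(), so the code checks for it before changing it.

```r
library(JointAI)

# Run the MCMC chains in separate R sessions (2 workers is an arbitrary choice).
future::plan(future::multisession, workers = 2)

# Start from the default hyper-parameters and change a single value.
hyp <- default_hyperpars()
# Assumption: the normal-model hyper-parameters contain "shape_tau_norm".
stopifnot("shape_tau_norm" %in% names(hyp$norm))
hyp$norm["shape_tau_norm"] <- 0.5   # illustrative value only

fit <- lm_imp(SBP ~ gender + age + WC + alc,
              data = NHANES,
              n.iter = 500,
              shrinkage = "ridge",   # ridge penalty on the regression coefficients
              hyperpars = hyp,
              seed = 2020)

# Return to sequential processing afterwards.
future::plan(future::sequential)
```

Inspecting the output of default_hyperpars() before modifying it is a simple way to see which hyper-parameters exist for each model type.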
To obtain summaries of the results, the functions summary(), coef() and confint() are available, and results can be visualized with the help of traceplot() or densplot(). The function predict() allows prediction (including credible intervals) from JointAI models.
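A short sketch of a typical post-fit workflow is given below. It assumes that JAGS and JointAI are installed; the model, the seed and the use of predDF() to build a prediction grid over age are illustrative assumptions. Because the structure of the object returned by predict() may differ between package versions, the example inspects it with str() instead of accessing specific elements.

```r
library(JointAI)

fit <- lm_imp(SBP ~ gender + age + WC,
              data = NHANES,
              n.iter = 500,
              seed = 2020,
              progress.bar = "none")

# Numerical summaries of the posterior sample
summary(fit)
coef(fit)      # posterior means of the regression coefficients
confint(fit)   # credible intervals

# Visual checks of the MCMC chains and posterior densities
traceplot(fit)
densplot(fit)

# Prediction over a grid of ages; predDF() creates the grid, with the
# remaining covariates set to reference values.
newDF <- predDF(fit, vars = ~ age)
pred <- predict(fit, newdata = newDF)
str(pred, max.level = 1)
```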
Evaluation and export
Two criteria for evaluation of convergence and precision of the posterior estimate are available:
- GR_crit implements the Gelman-Rubin criterion ('potential scale reduction factor') for convergence
- MC_error calculates the Monte Carlo error to evaluate the precision of the MCMC sample
Imputed data can be extracted (and exported to SPSS) using get_MIdat(). The function plot_imp_distr() allows visual comparison of the distribution of observed and imputed values.
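The evaluation and export steps could look as follows. This is a sketch that assumes JAGS and JointAI are installed; the model, the number of imputed data sets (m = 5) and the seeds are illustrative. Note that imputed values can only be extracted if they were monitored, which is requested here via monitor_params = c(imps = TRUE). The plotting function may need additional plotting packages to be installed.

```r
library(JointAI)

fit <- lm_imp(SBP ~ gender + age + WC + alc,
              data = NHANES,
              n.iter = 500,
              monitor_params = c(imps = TRUE),  # keep the imputed values
              seed = 2020,
              progress.bar = "none")

# Convergence: Gelman-Rubin criterion (values close to 1 are desirable)
GR_crit(fit)

# Precision: Monte Carlo error of the MCMC sample
MC_error(fit)

# Extract 5 imputed data sets, stacked in one data frame
impDF <- get_MIdat(fit, m = 5, seed = 2020)

# Compare observed and imputed values visually
plot_imp_distr(impDF)
```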
Other useful functions
- parameters and list_models to gain insight into the specified model
- plot_all and md_pattern to visualize the distribution of the data and the missing data pattern
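As a rough sketch of how these helpers might be used together, the example below first explores the NHANES example data and then inspects a fitted model. It assumes that JAGS and JointAI are installed; the model and the short run of 100 iterations are illustrative, and md_pattern() is called with plot = FALSE and pattern = TRUE so that the pattern is returned as a matrix without drawing a plot.

```r
library(JointAI)

# Explore the data before modelling
plot_all(NHANES)                                   # histograms and bar plots
mdp <- md_pattern(NHANES, plot = FALSE, pattern = TRUE)
mdp                                                # missing data pattern

# Fit a small model and inspect what was specified automatically
fit <- lm_imp(SBP ~ gender + age + WC + alc,
              data = NHANES,
              n.iter = 100,
              seed = 2020,
              progress.bar = "none")

list_models(fit)   # models used for the incomplete covariates
parameters(fit)    # parameters (nodes) that are monitored
```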
Vignettes
The following vignettes are available:
- Minimal Example: A minimal example demonstrating the use of lm_imp, summary.JointAI, traceplot and densplot.
- Visualizing Incomplete Data: Demonstrations of the options in plot_all (plotting histograms and bar plots for all variables in the data) and md_pattern (plotting or printing the missing data pattern).
- Model Specification: Explanation and demonstration of all parameters that are required or optional to specify the model structure in lm_imp, glm_imp and lme_imp. Among others, the functions parameters, list_models and set_refcat are used.
- Parameter Selection: Examples on how to select the parameters/variables/nodes to follow using the argument monitor_params and the parameters/variables/nodes displayed in the summary, traceplot, densplot or when using GR_crit or MC_error.
- MCMC Settings: Examples demonstrating how to set the arguments controlling settings of the MCMC sampling, i.e., n.adapt, n.iter, n.chains, thin, inits.
- After Fitting: Examples on the use of functions to be applied after the model has been fitted, including traceplot, densplot, summary, GR_crit, MC_error, predict, predDF and get_MIdat.
- Theoretical Background: Explanation of the statistical method implemented in JointAI.
References
Erler NS, Rizopoulos D, Lesaffre EMEH (2021). "JointAI: Joint Analysis and Imputation of Incomplete Data in R." Journal of Statistical Software, 100(20), 1-56. doi:10.18637/jss.v100.i20.
Erler, N.S., Rizopoulos, D., Rosmalen, J., Jaddoe, V.W.V., Franco, O. H., & Lesaffre, E.M.E.H. (2016). Dealing with missing covariates in epidemiologic studies: A comparison between multiple imputation and a full Bayesian approach. Statistics in Medicine, 35(17), 2955-2974. doi:10.1002/sim.6944
Erler, N.S., Rizopoulos, D., Jaddoe, V.W.V., Franco, O.H., & Lesaffre, E.M.E.H. (2019). Bayesian imputation of time-varying covariates in linear mixed models. Statistical Methods in Medical Research, 28(2), 555–568. doi:10.1177/0962280217730851