BaBooN-package {BaBooN}    R Documentation

Package for multiple imputation of missing values based on Bayesian Bootstrap with Predictive Mean Matching.


Two variants of Bayesian Bootstrap Predictive Mean Matching are included for multiply imputing missing data. The first variant is a variable-by-variable imputation that combines sequential regression with Predictive Mean Matching (PMM), extended to handle unordered categorical data. The Bayesian Bootstrap allows for generating approximately proper multiple imputations. The second variant is also based on PMM but imputes several variables at the same time. It is recommended when the missing-data pattern resembles a data fusion situation, or any other missing-by-design pattern, in which several variables have identical missing-data patterns. Both variants can also be run as 'single imputation' versions if the analysis objective is purely descriptive.
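The following base-R code is a minimal sketch of the general idea behind one Bayesian Bootstrap PMM draw for a single numeric variable with fully observed predictors. It is not the package's implementation: the function name bb_pmm_once, the use of lm() as the regression model, and the simulated data are all assumptions made for illustration, and the package's own algorithm may differ in its details.

```r
# Hypothetical helper (not part of BaBooN): one imputation draw for a numeric
# variable y with missing values, given a data frame X of complete predictors.
bb_pmm_once <- function(y, X) {
  obs <- !is.na(y)
  donors <- which(obs)        # cases with observed y
  recipients <- which(!obs)   # cases with missing y
  n_obs <- length(donors)

  # Bayesian Bootstrap: the gaps between sorted uniform draws give
  # Dirichlet(1, ..., 1) selection probabilities for the observed cases.
  w <- diff(c(0, sort(runif(n_obs - 1)), 1))
  idx <- sample(donors, size = n_obs, replace = TRUE, prob = w)

  # Fit the regression on the bootstrap sample, then predict for all cases.
  dat <- data.frame(y = y, X)
  fit <- lm(y ~ ., data = dat[idx, , drop = FALSE])
  pred <- predict(fit, newdata = dat)

  # Predictive Mean Matching: each recipient receives the observed value of
  # the donor whose predicted mean is closest to its own.
  nearest <- vapply(
    recipients,
    function(i) donors[which.min(abs(pred[donors] - pred[i]))],
    integer(1)
  )
  y[recipients] <- y[nearest]
  y
}

# Simulated example data (assumption, for illustration only).
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 - x2 + rnorm(n)
y_mis <- y
y_mis[sample(n, 50)] <- NA

# M = 5 imputations: each column of 'imps' is one completed version of y.
imps <- replicate(5, bb_pmm_once(y_mis, data.frame(x1, x2)))
```

Because a new Bayesian Bootstrap sample is drawn for every column, the imputations reflect uncertainty in the regression parameters. Calling the function once corresponds to the single-imputation case mentioned above.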


Package: BaBooN
Type: Package
Version: 0.2-0
Date: 2015-06-15
License: GPL (>= 2)


Florian Meinfelder [aut, cre] <florian.meinfelder[AT]>
Thorsten Schnapp [aut] <thorsten.schnapp[AT]>
Maintainer: Florian Meinfelder <florian.meinfelder[AT]>


Koller-Meinfelder, F. (2009) Analysis of Incomplete Survey Data – Multiple Imputation Via Bayesian Bootstrap Predictive Mean Matching, doctoral thesis.

Cowles, M.K. and Carlin, B.P. (1996) Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review. Journal of the American Statistical Association, Vol. 91, pp. 883–904.

Eddelbuettel, D. and Francois, R. (2011) Rcpp: Seamless R and C++ Integration. Journal of Statistical Software, Vol. 40, No. 8, pp. 1–18.

Eddelbuettel, D. and Sanderson, C. (2014) RcppArmadillo: Accelerating R with high-performance C++ linear algebra. Computational Statistics and Data Analysis, Vol. 71, March 2014, pp. 1054–1063.

Harrell, F.E., with contributions from Charles Dupont and many others. (2013) Hmisc: Harrell Miscellaneous. R package version 3.13-0.

Little, R.J.A. (1988) Missing-Data Adjustments in Large Surveys. Journal of Business and Economic Statistics, Vol. 6, No. 3, pp. 287–296.

Plummer, M. and Best, N. and Cowles, K. and Vines, K. (2006) CODA: Convergence Diagnosis and Output Analysis for MCMC. R News, Vol. 6, pp. 7–11.

Ported to R by Alvaro A. Novo. Original by Joseph L. Schafer. (2013) norm: Analysis of multivariate normal datasets with missing values. R package version 1.0-9.5.

R Core Team (2015) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Raghunathan, T.E. and Lepkowski, J.M. and Van Hoewyk, J. and Solenberger, P. (2001) A multivariate technique for multiply imputing missing values using a sequence of regression models. Survey Methodology, Vol. 27, pp. 85–95.

Rubin, D.B. (1981) The Bayesian Bootstrap. The Annals of Statistics, Vol. 9, pp. 130–134.

Rubin, D.B. (1987) Multiple Imputation for Non-Response in Surveys. New York: John Wiley & Sons, Inc.

Sarkar, Deepayan (2008) Lattice: Multivariate Data Visualization with R. New York: Springer.

Van Buuren, S. (2012) Flexible imputation of missing data. Boca Raton: CRC Press.

Van Buuren, S. and Brand, J.P.L. and Groothuis-Oudshoorn, C.G.M. and Rubin, D.B. (2006) Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, Vol. 76, No. 12, pp. 1049–1064.

Van Buuren, S. and Groothuis-Oudshoorn, K. (2011) mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, Vol. 45, No. 3, pp. 1–67.

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. New York: Springer.
