mice.impute.smcfcs {miceadds}    R Documentation

Substantive Model Compatible Multiple Imputation (Single Level)

Description

Performs substantive model compatible multiple imputation (Bartlett et al., 2015; Bartlett & Morris, 2015) for single-level data. Several distribution types are available for the imputed variable (see dep_type).

Usage

mice.impute.smcfcs(y, ry, x, wy=NULL, sm, dep_type="norm", sm_type="norm",
       fac_sd_proposal=1, mh_iter=20, ...)

Arguments

y

Incomplete data vector of length n

ry

Vector of missing data pattern (FALSE – missing, TRUE – observed)

x

Matrix (n x p) of complete covariates.

wy

Logical vector indicating positions where imputations should be conducted.

sm

Formula for substantive model.

dep_type

Distribution type of the variable to be imputed. Possible choices are "norm" (normal distribution), "lognorm" (lognormal distribution), "yj" (Yeo-Johnson distribution, see mdmb::yjt_regression), "bc" (Box-Cox distribution, see mdmb::bct_regression), and "logistic" (logistic distribution).

sm_type

Distribution type of the dependent variable in the substantive model. Any of the distributions listed under dep_type can be chosen.

fac_sd_proposal

Starting value for factor of standard deviation in Metropolis-Hastings sampling.

mh_iter

Number of iterations in Metropolis-Hastings sampling.

...

Further arguments to be passed

Details

Imputed values are drawn based on a Metropolis-Hastings sampling algorithm in which the standard deviation of the proposal distribution is adaptively tuned.
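
The idea of drawing imputations by Metropolis-Hastings sampling with a tuned proposal can be sketched in base R. This is a minimal illustration and not the code used in miceadds: the model parameters are held fixed at complete-case estimates (a full algorithm would also sample them), the tuning rule with thresholds 0.2 and 0.5 is a hypothetical choice, and the names log_target, fac and x_imp are introduced only for this sketch. The target density for a missing x is proportional to f(y | x, z) * f(x | z), which is what makes the draws compatible with the substantive model y ~ x*z.

```r
set.seed(1)
n <- 200
z <- rnorm(n)
x <- 0.5 * z + rnorm(n, sd = 0.7)
y <- rnorm(n, mean = 0.3 * x - 0.2 * z + 0.7 * x * z, sd = 1)
ry <- rep(TRUE, n)
ry[seq(1, n, 4)] <- FALSE            # every fourth x is treated as missing
x_obs <- ifelse(ry, x, NA)

# ASSUMPTION: parameters fixed at complete-case estimates for illustration
fit_sm <- lm(y ~ x_obs * z)          # substantive model y ~ x*z
fit_x  <- lm(x_obs ~ z)              # covariate model x ~ z
beta  <- unname(coef(fit_sm)); sd_y <- summary(fit_sm)$sigma
alpha <- unname(coef(fit_x));  sd_x <- summary(fit_x)$sigma

# log of f(y | x, z) * f(x | z), vectorized over cases
log_target <- function(xv, yv, zv) {
  mu_y <- beta[1] + beta[2] * xv + beta[3] * zv + beta[4] * xv * zv
  mu_x <- alpha[1] + alpha[2] * zv
  dnorm(yv, mu_y, sd_y, log = TRUE) + dnorm(xv, mu_x, sd_x, log = TRUE)
}

mis <- which(!ry)
x_cur <- alpha[1] + alpha[2] * z[mis]   # start at the conditional mean of x given z
fac <- 1                                # scaling factor for the proposal SD
mh_iter <- 20
for (it in seq_len(mh_iter)) {
  # random-walk proposal around the current values
  x_prop <- x_cur + rnorm(length(mis), sd = fac * sd_x)
  log_ratio <- log_target(x_prop, y[mis], z[mis]) -
    log_target(x_cur, y[mis], z[mis])
  accept <- log(runif(length(mis))) < log_ratio
  x_cur[accept] <- x_prop[accept]
  # HYPOTHETICAL tuning rule: shrink or widen the proposal by acceptance rate
  acc_rate <- mean(accept)
  if (acc_rate < 0.2) fac <- fac * 0.8
  if (acc_rate > 0.5) fac <- fac * 1.25
}
x_imp <- x_cur                          # one imputed value per missing entry
```

In this sketch fac plays a role similar in spirit to fac_sd_proposal, and mh_iter mirrors the argument of the same name; how the package actually adapts the proposal is not specified here.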

Value

A vector of length nmis=sum(!ry) with imputed values.
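
The relation between ry, wy and the length of the returned vector can be illustrated with a toy vector. This is a sketch only: the draws from rnorm are placeholders standing in for the values an imputation function would return, and setting wy to !ry follows the usual mice convention for wy=NULL, which is assumed here rather than taken from this page.

```r
y <- c(1.2, NA, 0.7, NA, 2.1)
ry <- !is.na(y)          # TRUE = observed, FALSE = missing
wy <- !ry                # ASSUMPTION: positions imputed when wy=NULL
nmis <- sum(!ry)         # number of values the function has to return

set.seed(3)
imputed <- rnorm(nmis)   # placeholder for the vector of imputed values

# the caller fills the imputed values into the positions flagged by wy
y_completed <- y
y_completed[wy] <- imputed
```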

References

Bartlett, J. W., & Morris, T. P. (2015). Multiple imputation of covariates by substantive-model compatible fully conditional specification. Stata Journal, 15(2), 437-456.

Bartlett, J. W., Seaman, S. R., White, I. R., Carpenter, J. R., & Alzheimer's Disease Neuroimaging Initiative (2015). Multiple imputation of covariates by fully conditional specification: Accommodating the substantive model. Statistical Methods in Medical Research, 24(4), 462-487. doi:10.1177/0962280214521348

See Also

See the smcfcs package for an alternative implementation of substantive model compatible multiple imputation in a fully conditional specification approach.

Examples

## Not run: 
#############################################################################
# EXAMPLE 1: Substantive model with interaction effects
#############################################################################

library(mice)
library(miceadds)   # provides mice.impute.smcfcs, which mice looks up by method name
library(mdmb)

#--- simulate data
set.seed(98)
N <- 1000
x <- stats::rnorm(N)
z <- 0.5*x + stats::rnorm(N, sd=.7)
y <- stats::rnorm(N, mean=.3*x - .2*z + .7*x*z, sd=1 )
dat <- data.frame(x,z,y)
dat[ seq(1,N,3), c("x","y") ] <- NA


#--- define imputation methods
imp <- mice::mice(dat, maxit=0)
method <- imp$method
method["x"] <- "smcfcs"

# define substantive model
sm <- y ~ x*z
# define formulas for imputation models
formulas <- as.list( rep("",ncol(dat)))
names(formulas) <- colnames(dat)
formulas[["x"]] <- x ~ z
formulas[["y"]] <- sm
formulas[["z"]] <- z ~ 1

#- Yeo-Johnson distribution for x
dep_type <- list()
dep_type$x <- "yj"

#-- do imputation
imp <- mice::mice(dat, method=method, sm=sm, formulas=formulas, m=1, maxit=10,
                   dep_type=dep_type)
summary(imp)

#############################################################################
# EXAMPLE 2: Substantive model with quadratic effects
#############################################################################

#** simulate data with missings
set.seed(50)
n <- 1000
x <- stats::rnorm(n)
z <- stats::rnorm(n)
y <- 0.5 * z + x + x^2 + stats::rnorm(n)
mm <- stats::runif(n)
x[sample(1:n, size=370, prob=mm)] <- NA
z[sample(1:n, size=480, prob=mm)] <- NA
y[sample(1:n, size=500, prob=mm)] <- NA

df <- data.frame(x=x,y=y,z=z)

#** imputation
imp <- mice::mice(df, method="smcfcs", sm=y ~ z + x + I(x^2), m=6, maxit=10)
summary(imp)

#** analysis
summary(mice::pool(with(imp, stats::lm(y ~ z + x + I(x^2)))))

#** imputation using the smcfcs package
df$x_sq <- df$x^2
nonmice <- smcfcs::smcfcs(df, smtype="lm", smformula=y ~ z + x + x_sq,
             method=c("norm", "", "norm", "x^2"))
mice::pool(lapply(nonmice$impDatasets, function(x) stats::lm(y ~ z + x + x_sq, data=x)))

## End(Not run)

[Package miceadds version 3.17-44 Index]