mice.impute.lasso.select.norm {mice}    R Documentation
Imputation by indirect use of lasso linear regression
Description
Imputes univariate missing data using Bayesian linear regression following a preprocessing lasso variable selection step.
Usage
mice.impute.lasso.select.norm(y, ry, x, wy = NULL, nfolds = 10, ...)
Arguments
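y        Vector to be imputed.
ry       Logical vector of length length(y). The imputation model is fitted to the elements y[ry].
x        Numeric design matrix with length(y) rows, containing the predictors for y.
wy       Logical vector of length length(y). Imputations are created for the elements where wy is TRUE.
nfolds   The number of folds for the cross-validation of the lasso penalty. The default is 10.
...      Other named arguments.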
Details
The method consists of the following steps:
1. For a given y variable under imputation, fit a linear regression with lasso penalty using y[ry] as dependent variable and x[ry, ] as predictors. Coefficients that are not shrunk to 0 define an active set of predictors that will be used for imputation.
2. Define a Bayesian linear model using y[ry] as the dependent variable, the active set of x[ry, ] as predictors, and standard non-informative priors.
3. Draw parameter values for the intercept, regression weights, and error variance from their posterior distribution.
4. Draw imputations from the posterior predictive distribution.
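The four steps above can be sketched in a few lines of R. This is an illustrative sketch, not the package's internal code: it assumes the glmnet package is installed, uses simulated data, picks the penalty at lambda.min (an assumption), and all object names (cv_fit, active, beta_star, sigma_star) are hypothetical.

```r
library(glmnet)

set.seed(123)
n <- 200; p <- 10
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- 1 + 2 * x[, 1] - 1.5 * x[, 2] + rnorm(n)
ry <- rep(TRUE, n)
ry[sample(n, 40)] <- FALSE          # FALSE marks the missing entries
y[!ry] <- NA
wy <- !ry                           # impute exactly the missing entries

# Step 1: cross-validated lasso on the observed cases, then the active set
cv_fit <- cv.glmnet(x[ry, ], y[ry], family = "gaussian", alpha = 1, nfolds = 10)
beta_lasso <- as.matrix(coef(cv_fit, s = "lambda.min"))[-1, 1]  # drop intercept
active <- which(beta_lasso != 0)

# Step 2: linear model on the active set (intercept column added by hand)
xa <- cbind(1, x[, active, drop = FALSE])
xo <- xa[ry, , drop = FALSE]
yo <- y[ry]
xtx_inv <- solve(crossprod(xo))
beta_hat <- xtx_inv %*% crossprod(xo, yo)
res <- yo - xo %*% beta_hat

# Step 3: draw error SD and coefficients from their posterior
# under the standard non-informative prior
df <- sum(ry) - ncol(xo)
sigma_star <- sqrt(sum(res^2) / rchisq(1, df))
beta_star <- beta_hat + sigma_star * t(chol(xtx_inv)) %*% rnorm(ncol(xo))

# Step 4: draw imputations from the posterior predictive distribution
imp <- as.vector(xa[wy, , drop = FALSE] %*% beta_star + rnorm(sum(wy)) * sigma_star)
```

Because the lasso is used only to choose predictors, the imputations themselves come from an unpenalized Bayesian model, which is what "indirect use" refers to.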
The user can specify a predictorMatrix in the mice call to define which predictors are provided to this univariate imputation method. The lasso regularization will select, among the variables indicated by the user, the ones that are important for imputation at any given iteration. Therefore, users may force the exclusion of a predictor from a given imputation model by specifying a 0 entry in the predictorMatrix. However, a non-zero entry does not guarantee that the variable will be used, as this decision is ultimately made by the lasso variable selection procedure.
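As a hedged illustration of the predictorMatrix mechanism, the sketch below forces one predictor out of the imputation model for y and leaves the rest to the lasso. The data are simulated and the variable names (dat, x1 to x5) are hypothetical; it assumes the mice and glmnet packages are installed.

```r
library(mice)

set.seed(2021)
n <- 200
dat <- data.frame(matrix(rnorm(n * 6), n, 6))
names(dat) <- c("y", paste0("x", 1:5))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)
dat$y[sample(n, 50)] <- NA          # make 50 values of y missing

# Start from the default predictor matrix and exclude x5 for y
pred <- make.predictorMatrix(dat)
pred["y", "x5"] <- 0                # 0 entry: x5 is never offered to the lasso

# Use this method for y only
meth <- make.method(dat)
meth["y"] <- "lasso.select.norm"

imp <- mice(dat, method = meth, predictorMatrix = pred,
            m = 5, maxit = 5, printFlag = FALSE, seed = 1)
```

Here x1 to x4 remain candidates, but whether each one enters the model at a given iteration is decided by the lasso step.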
The method is based on the Indirect Use of Regularized Regression (IURR) proposed by Zhao & Long (2016) and Deng et al. (2016).
Value
Vector with imputed data, same type as y, and of length sum(wy)
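The function can also be called directly, outside of mice, which makes the returned length easy to check. The following is a minimal sketch on simulated data, assuming mice and glmnet are installed; nfolds = 5 is an arbitrary choice for illustration, and wy is left at its default of NULL so that imputations are drawn for the entries where ry is FALSE.

```r
library(mice)

set.seed(1)
n <- 100
x <- matrix(rnorm(n * 5), n, 5)
y <- x[, 1] + rnorm(n)
ry <- rep(TRUE, n)
ry[1:20] <- FALSE                   # treat the first 20 entries as missing
y[!ry] <- NA

# One set of imputations for the missing entries of y
draws <- mice.impute.lasso.select.norm(y, ry, x, nfolds = 5)
length(draws) == sum(!ry)
```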
Author(s)
Edoardo Costantini, 2021
References
Deng, Y., Chang, C., Ido, M. S., & Long, Q. (2016). Multiple imputation for general missing data patterns in the presence of high-dimensional data. Scientific Reports, 6(1), 1-10.
Zhao, Y., & Long, Q. (2016). Multiple imputation in the presence of high-dimensional data. Statistical Methods in Medical Research, 25(5), 2021-2035.
See Also
Other univariate imputation functions: mice.impute.cart(), mice.impute.lasso.logreg(), mice.impute.lasso.norm(), mice.impute.lasso.select.logreg(), mice.impute.lda(), mice.impute.logreg.boot(), mice.impute.logreg(), mice.impute.mean(), mice.impute.midastouch(), mice.impute.mnar.logreg(), mice.impute.mpmm(), mice.impute.norm.boot(), mice.impute.norm.nob(), mice.impute.norm.predict(), mice.impute.norm(), mice.impute.pmm(), mice.impute.polr(), mice.impute.polyreg(), mice.impute.quadratic(), mice.impute.rf(), mice.impute.ri()