| mice.impute.pls {miceadds} | R Documentation | 
Imputation using Partial Least Squares for Dimension Reduction
Description
This function imputes a variable with missing values using partial least squares (PLS) regression (Mevik & Wehrens, 2007) to reduce the dimension of the predictor space.
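The idea behind the dimension reduction can be sketched in base R. The following is a minimal illustration of a PLS1 decomposition (NIPALS algorithm) for a single response, not the code used inside mice.impute.pls. The helper name pls1_scores and the simulated data are assumptions made for this example. A small number of mutually orthogonal score vectors takes the place of the full predictor matrix in a later regression model.

```r
# Hypothetical helper: PLS1 scores via the NIPALS algorithm (illustration only)
pls1_scores <- function(X, y, ncomp) {
  X <- scale(X, center = TRUE, scale = FALSE)   # center predictors
  y <- y - mean(y)                              # center response
  scores <- matrix(0, nrow(X), ncomp)
  for (a in seq_len(ncomp)) {
    w <- crossprod(X, y)                        # direction of covariance with y
    w <- w / sqrt(sum(w^2))                     # normalize the weight vector
    t_a <- X %*% w                              # score vector for factor a
    p_a <- crossprod(X, t_a) / sum(t_a^2)       # loadings of X on the score
    q_a <- sum(y * t_a) / sum(t_a^2)            # loading of y on the score
    X <- X - t_a %*% t(p_a)                     # deflate X
    y <- y - q_a * as.vector(t_a)               # deflate y
    scores[, a] <- t_a
  }
  scores
}

set.seed(1)
X <- matrix(rnorm(200 * 30), 200, 30)           # simulated predictors
y <- as.vector(X[, 1:3] %*% c(1, -1, 0.5)) + rnorm(200)
facs <- pls1_scores(X, y, ncomp = 4)            # 4 PLS factors for 200 cases
dim(facs)
```

In this sketch the 30 simulated predictors are condensed into 4 factors, which corresponds to the role of the pls.facs argument.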
Usage
mice.impute.pls(y, ry, x, type, pls.facs=NULL,
   pls.impMethod="pmm", donors=5, pls.impMethodArgs=NULL, pls.print.progress=TRUE,
   imputationWeights=rep(1, length(y)), pcamaxcols=1E+09,
   min.int.cor=0, min.all.cor=0, N.largest=0, pls.title=NULL, print.dims=TRUE,
   pls.maxcols=5000, use_boot=FALSE, envir_pos=NULL, extract_data=TRUE,
   remove_lindep=TRUE, derived_vars=NULL, ...)
mice.impute.2l.pls2(y, ry, x, type, pls.facs=NULL, pls.impMethod="pmm",
   pls.print.progress=TRUE, imputationWeights=rep(1, length(y)), pcamaxcols=1E+09,
   tricube.pmm.scale=NULL, min.int.cor=0, min.all.cor=0, N.largest=0,
   pls.title=NULL, print.dims=TRUE, pls.maxcols=5000, envir_pos=parent.frame(), ...)
Arguments
| y | Incomplete data vector of length  | 
| ry | Vector of missing data pattern (FALSE: missing, TRUE: observed) | 
| x | Matrix ( | 
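| type | Type of each predictor, as coded in the predictor matrix. The Examples use 1 for an ordinary predictor, 4 for predictors that enter interactions, and 5 for predictors that enter as linear and quadratic terms. | 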
| pls.facs | Number of factors used in PLS regression. This argument can also be specified as a list defining different numbers of factors for all variables to be imputed. | 
| pls.impMethod | Imputation method used in PLS estimation.
Any imputation method can be used except if  | 
| donors | Number of donors if predictive mean matching is used (pls.impMethod="pmm"). | 
| pls.impMethodArgs | Arguments for imputation method | 
| pls.print.progress | Print progress during PLS regression. | 
| imputationWeights | Vector of sample weights to be used in imputation models. | 
| pcamaxcols | Amount of variance explained by principal components (must be a number between 0 and 1) or number of factors used in PCA (an integer larger than 1). | 
| min.int.cor | Minimum absolute correlation for an interaction of two predictors to be included in the PLS regression model. | 
| min.all.cor | Minimum absolute correlation for inclusion in the PLS regression model. | 
| N.largest | Number of variables with the largest absolute correlations to be included. | 
| pls.title | Title for progress print in console output. | 
| print.dims | An optional logical indicating whether dimensions of inputs should be printed. | 
| pls.maxcols | Maximum number of interactions to be created. | 
| use_boot | Logical indicating whether the Bayesian bootstrap should be used for drawing regression parameters. | 
| envir_pos | Position of the environment from which the data should be extracted. | 
| extract_data | Logical indicating whether input data should be extracted
from parent environment within  | 
| remove_lindep | Logical indicating whether linear dependencies should be detected automatically and some predictors removed. | 
| derived_vars | Optional list containing formulas with derived variables for inclusion in the PLS dimension reduction. | 
| ... | Further arguments to be passed. | 
| tricube.pmm.scale | Scale factor for tricube PMM imputation. | 
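The screening arguments min.all.cor and N.largest can be read as filters on absolute correlations. The sketch below illustrates that reading in base R with simulated data. It rests on two assumptions that this page does not confirm: that the correlations are taken between each predictor and the observed part of y, and that the two criteria are applied separately. screen_predictors is a hypothetical helper, not a miceadds function.

```r
# Hypothetical helper illustrating correlation-based predictor screening.
# Assumption: correlations are computed between each column of x and y,
# using only cases with observed y (ry == TRUE).
screen_predictors <- function(y, ry, x, min.all.cor = 0, N.largest = 0) {
  abs_cor <- abs(stats::cor(x[ry, , drop = FALSE], y[ry]))[, 1]
  if (N.largest > 0) {
    # keep the N.largest predictors with the highest absolute correlation
    keep <- order(abs_cor, decreasing = TRUE)[seq_len(min(N.largest, ncol(x)))]
    return(sort(keep))
  }
  # otherwise keep every predictor that reaches the minimum absolute correlation
  which(abs_cor >= min.all.cor)
}

set.seed(42)
n <- 300
x <- matrix(rnorm(n * 6), n, 6, dimnames = list(NULL, paste0("V", 1:6)))
y <- 2 * x[, 1] - 1.5 * x[, 4] + rnorm(n)   # only V1 and V4 are related to y
y[sample(n, 50)] <- NA                      # introduce missing values
ry <- !is.na(y)

sel_thresh <- screen_predictors(y, ry, x, min.all.cor = 0.3)  # threshold rule
sel_top <- screen_predictors(y, ry, x, N.largest = 2)         # top-N rule
```

Both rules return column indices of x, so the result can be used to subset the predictor matrix before the PLS step.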
Value
A vector of length nmis=sum(!ry) containing the imputations
if pls.impMethod !="xplsfacs". If
pls.impMethod=="xplsfacs", a matrix with PLS factors
is returned instead.
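The relation between ry and the length of the imputation vector can be illustrated without running the imputation itself. In the sketch below, a resampling of observed values stands in for the real imputations (an assumption made purely for illustration). The point is only how a vector of length sum(!ry) is placed back into y.

```r
y <- c(3.1, NA, 2.4, NA, 5.0, 4.2)
ry <- !is.na(y)            # TRUE: observed, FALSE: missing
nmis <- sum(!ry)           # number of values to impute

# Placeholder for the vector produced by an imputation function:
# here simply resampled observed values (assumption for illustration)
set.seed(7)
imp <- sample(y[ry], size = nmis, replace = TRUE)

y_completed <- y
y_completed[!ry] <- imp    # imputations fill the missing positions in order
```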
Note
The mice.impute.2l.pls2 function is included only for
backward compatibility with earlier miceadds versions.
References
Mevik, B. H., & Wehrens, R. (2007). The pls package: Principal component and partial least squares regression in R. Journal of Statistical Software, 18, 1-24. doi:10.18637/jss.v018.i02
Examples
## Not run: 
#############################################################################
# EXAMPLE 1: PLS imputation method for internet data
#############################################################################
data(data.internet)
dat <- data.internet
# specify predictor matrix
predictorMatrix <- matrix( 1, ncol(dat), ncol(dat) )
rownames(predictorMatrix) <- colnames(predictorMatrix) <- colnames(dat)
diag( predictorMatrix) <- 0
# use PLS imputation method for all variables
impMethod <- rep( "pls", ncol(dat) )
names(impMethod) <- colnames(dat)
# define predictors for interactions (entries with type 4 in predictorMatrix)
predictorMatrix[c("IN1","IN15","IN16"),c("IN1","IN3","IN10","IN13")] <- 4
# define predictors which should appear as linear and quadratic terms (type 5)
predictorMatrix[c("IN1","IN8","IN9","IN10","IN11"),c("IN1","IN2","IN7","IN5")] <- 5
# use 9 PLS factors for all variables
pls.facs <- as.list( rep( 9, length(impMethod) ) )
names(pls.facs) <- names(impMethod)
pls.facs$IN1 <- 15   # use 15 PLS factors for variable IN1
# choose norm or pmm imputation method
pls.impMethod <- as.list( rep("norm", length(impMethod) ) )
names(pls.impMethod) <- names(impMethod)
pls.impMethod[ c("IN1","IN6")] <- "pmm"
# some arguments for imputation method
pls.impMethodArgs <- list( "IN1"=list( "donors"=10 ),
                           "IN2"=list( "ridge2"=1E-4 ) )
# Model 1: Three parallel chains
imp1 <- mice::mice(data=dat, method=impMethod,
     m=3, maxit=5, predictorMatrix=predictorMatrix,
     pls.facs=pls.facs, # number of PLS factors
     pls.impMethod=pls.impMethod,  # Imputation Method in PLS imputation
     pls.impMethodArgs=pls.impMethodArgs, # arguments for imputation method
     pls.print.progress=TRUE, ls.meth="ridge" )
summary(imp1)
# Model 2: One long chain
imp2 <- miceadds::mice.1chain(data=dat, method=impMethod,
     burnin=10, iter=21, Nimp=3, predictorMatrix=predictorMatrix,
     pls.facs=pls.facs, pls.impMethod=pls.impMethod,
     pls.impMethodArgs=pls.impMethodArgs, ls.meth="ridge" )
summary(imp2)
# Model 3: inclusion of additional derived variables
# define derived variables for IN1
derived_vars <- list( "IN1"=~I( ifelse( IN2>IN3, IN2, IN3 ) ) + I( sin(IN2) ) )
imp3 <- miceadds::mice.1chain(data=dat, method=impMethod, derived_vars=derived_vars,
     burnin=10, iter=21, Nimp=3, predictorMatrix=predictorMatrix,
     pls.facs=pls.facs, pls.impMethod=pls.impMethod,
     pls.impMethodArgs=pls.impMethodArgs, ls.meth="ridge" )
summary(imp3)
#*** example for using imputation function at the level of a variable
# extract first imputed dataset
data_imp1 <- mice::complete(imp1, action=1)
data_imp1[ is.na(dat$IN1), "IN1" ] <- NA
# define variables
y <- data_imp1$IN1
x <- data_imp1[, -1 ]
ry <- ! is.na(y)
cn <- colnames(dat)
p <- ncol(dat)
type <- rep(1,p)
names(type) <- cn
type["IN1"] <- 0
# imputation of variable 'IN1'
imp0 <- miceadds::mice.impute.pls(y=y, x=x, ry=ry, type=type, pls.facs=10, pls.impMethod="norm",
             ls.meth="ridge", extract_data=FALSE )
## End(Not run)