| mlr_pipeops_imputelearner {mlr3pipelines} | R Documentation |
Impute Features by Fitting a Learner
Description
Impute features by fitting a Learner for each feature.
Uses the features indicated by the context_columns parameter as features to train the imputation Learner.
Note that this parameter is part of the PipeOpImpute base class and is explained there.
Additionally, only features supported by the learner can be imputed; i.e. learners of type
regr can only impute features of type integer and numeric, while classif can impute
features of type factor, ordered and logical.
The Learner used for imputation is trained on all context_columns; if these contain missing values,
the Learner typically either needs to be able to handle missing values itself, or needs to do its
own imputation (see examples).
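The type restriction can be illustrated with a small sketch. The data frame below is made up for illustration (the columns y, num, fct and other are hypothetical), and the two PipeOps are simply applied one after the other; this is not meant as a recommended pipeline.

```r
library("mlr3")
library("mlr3pipelines")

set.seed(1)
# Hypothetical toy data: one numeric and one factor feature with missing values.
dat = data.frame(
  y = rnorm(50),
  num = rnorm(50),
  fct = factor(sample(c("a", "b"), 50, replace = TRUE)),
  other = rnorm(50)
)
dat$num[1:5] = NA
dat$fct[6:10] = NA
toy_task = as_task_regr(dat, target = "y")

# A regression Learner only imputes integer / numeric features,
# so 'fct' still has missing values afterwards.
op_num = po("imputelearner", lrn("regr.rpart"))
task_num = op_num$train(list(toy_task))[[1]]
task_num$missings()

# A classification Learner takes care of factor / ordered / logical features.
op_fct = po("imputelearner", lrn("classif.rpart"))
task_full = op_fct$train(list(task_num))[[1]]
task_full$missings()
```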
Format
R6Class object inheriting from PipeOpImpute/PipeOp.
Construction
PipeOpImputeLearner$new(learner, id = NULL, param_vals = list())
- id :: character(1)
  Identifier of resulting object, default "impute.", followed by the id of the Learner.
- learner :: Learner | character(1)
  Learner to wrap, or a string identifying a Learner in the mlr3::mlr_learners Dictionary. The Learner usually needs to be able to handle missing values, i.e. have the missings property, unless care is taken that context_columns do not contain missings; see examples.
  This argument is always cloned; to access the Learner inside PipeOpImputeLearner by-reference, use $learner.
- param_vals :: named list
  List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
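The cloning behaviour can be checked with a short sketch. The choice of "regr.rpart" and its maxdepth hyperparameter is only for illustration; any Learner and hyperparameter should behave the same way.

```r
library("mlr3")
library("mlr3pipelines")

learner = lrn("regr.rpart")
op = PipeOpImputeLearner$new(learner)

# $learner gives by-reference access to the clone wrapped inside the PipeOp.
inner = op$learner
inner$param_set$values$maxdepth = 3

op$learner$param_set$values$maxdepth   # value set on the wrapped clone
learner$param_set$values$maxdepth      # the original Learner is unchanged
```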
Input and Output Channels
Input and output channels are inherited from PipeOpImpute.
The output is the input Task with missing values from all affected features imputed by the trained model.
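As a rough sketch of how the channels are used, the PipeOp can be trained on one part of a Task and then applied to another. The split of the "pima" task at row 500 is arbitrary and only serves the illustration.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("pima")
train_task = task$clone()$filter(1:500)
test_task = task$clone()$filter(501:768)

op = po("imputelearner", lrn("regr.rpart"))

# Training fits one model per feature and returns the imputed training Task.
train_out = op$train(list(train_task))[[1]]

# Prediction reuses the fitted models to impute previously unseen rows.
test_out = op$predict(list(test_task))[[1]]
test_out$missings()
```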
State
The $state is a named list with the $state elements inherited from PipeOpImpute.
The $state$models is a named list of models created by the Learner's $.train() function
for each column. If a column consists of missing values only during training, the model is 0 or the levels of the
feature; these are used for sampling during prediction.
This state is given the class "pipeop_impute_learner_state".
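A minimal sketch for inspecting the state after training follows. It uses the same accessor as the Examples section (po$state$model$mass) and assumes the "pima" task, where the mass feature has missing values.

```r
library("mlr3")
library("mlr3pipelines")

op = po("imputelearner", lrn("regr.rpart"))
invisible(op$train(list(tsk("pima"))))

# The state carries its own class.
class(op$state)

# One entry per feature for which a model was fitted.
names(op$state$model)
```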
Parameters
The parameters are the parameters inherited from PipeOpImpute, in addition to the parameters of the Learner
used for imputation.
Internals
Uses the $train and $predict functions of the provided learner. Features that are entirely NA are imputed as 0
or randomly sampled from available (factor / logical) levels.
The Learner does not necessarily need to handle missing values in cases
where context_columns is chosen well (or there is only one column with missing values present).
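Restricting context_columns to features without missing values might look like the sketch below. It assumes the "pima" task, where age, pedigree and pregnant are complete, and uses selector_name() to pick them. "regr.rpart" would not strictly need this restriction since it handles missing values itself; it is used here only to keep the sketch free of additional packages.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("pima")

# Only complete features serve as predictors for the imputation models.
op = po("imputelearner", lrn("regr.rpart"),
  context_columns = selector_name(c("age", "pedigree", "pregnant")))

imputed = op$train(list(task))[[1]]
imputed$missings()
```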
Fields
Fields inherited from PipeOpTaskPreproc/PipeOp, as well as:
- learner :: Learner
  Learner that is being wrapped. Read-only.
- learner_models :: list of Learner | NULL
  Learner that is being wrapped. This list is named by features for which a Learner was fitted, and contains the same Learner, but with different respective models for each feature. If this PipeOp is not trained, this is an empty list. For features that were entirely NA during training, the list contains NULL elements.
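The learner_models field can be explored as in the following sketch, again assuming the "pima" task and "regr.rpart"; the element names are the features for which a model was fitted.

```r
library("mlr3")
library("mlr3pipelines")

op = po("imputelearner", lrn("regr.rpart"))

# Before training there are no fitted Learners.
n_before = length(op$learner_models)

invisible(op$train(list(tsk("pima"))))

# After training: one Learner per imputed feature, each carrying its own model.
names(op$learner_models)
op$learner_models$mass$model
```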
Methods
Only methods inherited from PipeOpImpute/PipeOp.
See Also
https://mlr-org.com/pipeops.html
Other PipeOps:
PipeOp,
PipeOpEnsemble,
PipeOpImpute,
PipeOpTargetTrafo,
PipeOpTaskPreproc,
PipeOpTaskPreprocSimple,
mlr_pipeops,
mlr_pipeops_boxcox,
mlr_pipeops_branch,
mlr_pipeops_chunk,
mlr_pipeops_classbalancing,
mlr_pipeops_classifavg,
mlr_pipeops_classweights,
mlr_pipeops_colapply,
mlr_pipeops_collapsefactors,
mlr_pipeops_colroles,
mlr_pipeops_copy,
mlr_pipeops_datefeatures,
mlr_pipeops_encode,
mlr_pipeops_encodeimpact,
mlr_pipeops_encodelmer,
mlr_pipeops_featureunion,
mlr_pipeops_filter,
mlr_pipeops_fixfactors,
mlr_pipeops_histbin,
mlr_pipeops_ica,
mlr_pipeops_imputeconstant,
mlr_pipeops_imputehist,
mlr_pipeops_imputemean,
mlr_pipeops_imputemedian,
mlr_pipeops_imputemode,
mlr_pipeops_imputeoor,
mlr_pipeops_imputesample,
mlr_pipeops_kernelpca,
mlr_pipeops_learner,
mlr_pipeops_missind,
mlr_pipeops_modelmatrix,
mlr_pipeops_multiplicityexply,
mlr_pipeops_multiplicityimply,
mlr_pipeops_mutate,
mlr_pipeops_nmf,
mlr_pipeops_nop,
mlr_pipeops_ovrsplit,
mlr_pipeops_ovrunite,
mlr_pipeops_pca,
mlr_pipeops_proxy,
mlr_pipeops_quantilebin,
mlr_pipeops_randomprojection,
mlr_pipeops_randomresponse,
mlr_pipeops_regravg,
mlr_pipeops_removeconstants,
mlr_pipeops_renamecolumns,
mlr_pipeops_replicate,
mlr_pipeops_scale,
mlr_pipeops_scalemaxabs,
mlr_pipeops_scalerange,
mlr_pipeops_select,
mlr_pipeops_smote,
mlr_pipeops_spatialsign,
mlr_pipeops_subsample,
mlr_pipeops_targetinvert,
mlr_pipeops_targetmutate,
mlr_pipeops_targettrafoscalerange,
mlr_pipeops_textvectorizer,
mlr_pipeops_threshold,
mlr_pipeops_tunethreshold,
mlr_pipeops_unbranch,
mlr_pipeops_updatetarget,
mlr_pipeops_vtreat,
mlr_pipeops_yeojohnson
Other Imputation PipeOps:
PipeOpImpute,
mlr_pipeops_imputeconstant,
mlr_pipeops_imputehist,
mlr_pipeops_imputemean,
mlr_pipeops_imputemedian,
mlr_pipeops_imputemode,
mlr_pipeops_imputeoor,
mlr_pipeops_imputesample
Examples
library("mlr3")
library("mlr3pipelines")
task = tsk("pima")
task$missings()
po = po("imputelearner", lrn("regr.rpart"))
new_task = po$train(list(task = task))[[1]]
new_task$missings()
# '$state' of the "regr.rpart" Learner, trained to predict the 'mass' column:
po$state$model$mass
library("mlr3learners")
# to use the "regr.kknn" Learner, prefix it with its own imputation method!
# The "imputehist" PipeOp is used to train "regr.kknn"; predictions of this
# trained Learner are then used to impute the missing values in the Task.
po = po("imputelearner",
  po("imputehist") %>>% lrn("regr.kknn")
)
new_task = po$train(list(task = task))[[1]]
new_task$missings()