sense {sense}    R Documentation

sense

Description

Stacked ensemble for regression tasks based on the 'mlr3' framework.

Usage

sense(
  df,
  target_feat,
  benchmarking = "all",
  super = "avg",
  algos = c("glmnet", "ranger", "xgboost", "rpart", "kknn", "svm"),
  sampling_rate = 1,
  metric = "mae",
  collapse_char_to = 10,
  num_preproc = "scale",
  fct_preproc = "one-hot",
  impute_num = "sample",
  missing_fusion = FALSE,
  inner = "holdout",
  outer = "holdout",
  folds = 3,
  repeats = 3,
  ratio = 0.5,
  selected_filter = "information_gain",
  selected_n_feats = NULL,
  tuning = "random_search",
  budget = 30,
  resolution = 5,
  n_evals = 30,
  minute_time = 10,
  patience = 0.3,
  min_improve = 0.01,
  java_mem = 64,
  decimals = 2,
  seed = 42
)

Arguments

df

A data frame with features and target.

target_feat

String. Name of the numeric feature for the regression task.

benchmarking

Positive integer or "all". Number of base learners to stack. Default: "all".

super

String. Super learner of choice among the available learners. Default: "avg".
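As a rough illustration of what an averaging super learner does, the sketch below combines base-learner predictions with an unweighted row mean. It assumes "avg" denotes a simple average, which this page does not confirm; the prediction values are made up for illustration.

```r
# Hypothetical predictions from three base learners on four observations.
base_preds <- cbind(
  glmnet = c(10, 12, 9, 11),
  ranger = c(11, 13, 8, 10),
  rpart  = c(12, 11, 10, 12)
)

# Assumed behaviour of an averaging super learner: unweighted mean per row.
stacked <- rowMeans(base_preds)
```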

algos

String vector. Available learners are: "glmnet", "ranger", "xgboost", "rpart", "kknn", "svm".

sampling_rate

Positive numeric. Sampling rate before applying the stacked ensemble. Default: 1.

metric

String. Evaluation metric for outer and inner cross-validation. Default: "mae".
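For reference, the default metric "mae" is the mean absolute error. A minimal base R sketch follows; the helper name mae_sketch is hypothetical and not part of the package.

```r
# Mean absolute error between observed and predicted values.
mae_sketch <- function(truth, response) {
  mean(abs(truth - response))
}

truth    <- c(1, 2, 3)
response <- c(2, 2, 5)
err <- mae_sketch(truth, response)
```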

collapse_char_to

Positive integer. Maximum number of levels kept when character features are converted to factors. Default: 10.
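One plausible way to cap the number of levels is to keep the most frequent values and lump the rest together. The sketch below assumes that strategy and a catch-all level named "other"; neither is confirmed by this page, and collapse_levels is a hypothetical helper.

```r
# Keep the (max_levels - 1) most frequent values; lump the rest into "other".
# This is an assumed strategy, not necessarily what sense() does internally.
collapse_levels <- function(x, max_levels) {
  f <- factor(x)
  if (nlevels(f) <= max_levels) return(f)
  counts <- sort(table(f), decreasing = TRUE)
  keep <- names(counts)[seq_len(max_levels - 1)]
  factor(ifelse(x %in% keep, x, "other"))
}

x <- c("a", "a", "a", "b", "b", "c", "d")
collapsed <- collapse_levels(x, max_levels = 3)
```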

num_preproc

String. Options for numeric pre-processing: "scale" or "range". Default: "scale".
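The two options can be pictured with base R as follows. The sketch assumes "scale" means standardization to mean 0 and standard deviation 1, and "range" means min-max rescaling to [0, 1]; the exact transformation applied by the package is not stated here.

```r
x <- c(2, 4, 6, 8)

# Assumed meaning of "scale": center and divide by the standard deviation.
scaled <- (x - mean(x)) / sd(x)

# Assumed meaning of "range": min-max rescaling to the interval [0, 1].
ranged <- (x - min(x)) / (max(x) - min(x))
```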

fct_preproc

String. Options for factor pre-processing: "encodeimpact", "encodelmer", "one-hot", "treatment", "poly", "sum", "helmert". Default: "one-hot".
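To make the difference between encodings concrete, the base R sketch below builds a one-hot and a treatment-coded design matrix with model.matrix. It only illustrates the general idea on a made-up factor; the package may produce differently named columns.

```r
colors <- data.frame(color = factor(c("red", "green", "blue", "green")))

# One-hot: one indicator column per level, no intercept.
one_hot <- model.matrix(~ color - 1, data = colors)

# Treatment coding: intercept plus indicators for all but the reference level.
treatment <- model.matrix(~ color, data = colors)
```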

impute_num

String. Imputation method for missing numeric values: "sample" or "hist". Default: "sample". Missing factor values are imputed in out-of-range mode.
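Imputation by sampling can be sketched in base R as below: each missing value is replaced by a random draw from the observed values of the same feature. This is an assumed reading of "sample", shown on a made-up vector.

```r
set.seed(42)
x <- c(1.5, NA, 3.0, NA, 4.5)

observed <- x[!is.na(x)]
n_missing <- sum(is.na(x))

# Draw replacement values (with replacement) from the observed ones.
draws <- observed[sample.int(length(observed), n_missing, replace = TRUE)]
imputed <- x
imputed[is.na(imputed)] <- draws
```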

missing_fusion

Logical. Whether to add missing-indicator features. Default: FALSE.

inner

String. Cross-validation inner cycle: "holdout", "cv", "repeated_cv", "subsampling". Default: "holdout".

outer

String. Cross-validation outer cycle: "holdout", "cv", "repeated_cv", "subsampling". Default: "holdout".

folds

Positive integer. Number of folds used in "cv" and "repeated_cv". Default: 3.

repeats

Positive integer. Number of repetitions used in "subsampling" and "repeated_cv". Default: 3.

ratio

Positive numeric. Split ratio (a fraction between 0 and 1) for "holdout" and "subsampling". Default: 0.5.
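The resampling arguments can be pictured with a small base R sketch. It assumes ratio is the share of rows used for training and that folds assigns each row to one of several roughly equal groups; the variable names are illustrative only.

```r
set.seed(42)
n <- 10

# Holdout: a fraction `ratio` of rows for training, the rest for testing
# (assumed interpretation of `ratio`).
ratio <- 0.5
train_idx <- sample.int(n, size = floor(ratio * n))
test_idx <- setdiff(seq_len(n), train_idx)

# Cross-validation: assign every row to one of `folds` groups at random.
folds <- 3
fold_id <- sample(rep(seq_len(folds), length.out = n))
```

With "repeated_cv", the fold assignment would be redrawn repeats times; with "subsampling", the holdout split would be redrawn repeats times.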

selected_filter

String. Filters available for regression tasks: "carscore", "cmim", "correlation", "find_correlation", "information_gain", "relief", "variance". Default: "information_gain".

selected_n_feats

Positive integer. Number of features to select through the chosen filter. Default: NULL.
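As an illustration of filter-based selection, the sketch below ranks features by absolute correlation with the target and keeps the top ones, loosely mirroring the "correlation" filter with selected_n_feats = 3. It uses the built-in mtcars data and is not the package's implementation.

```r
data(mtcars)
target <- "mpg"
features <- setdiff(names(mtcars), target)

# Score each feature by its absolute Pearson correlation with the target.
scores <- sapply(features, function(f) abs(cor(mtcars[[f]], mtcars[[target]])))

# Keep the three highest-scoring features.
n_feats <- 3
selected <- names(sort(scores, decreasing = TRUE))[seq_len(n_feats)]
```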

tuning

String. Available options are "random_search" and "grid_search". Default: "random_search".

budget

Positive integer. Maximum number of trials during random search. Default: 30.

resolution

Positive integer. Grid resolution for each hyper-parameter. Default: 5.
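The difference between the two tuning strategies can be sketched in base R. With grid search, resolution values per hyper-parameter yield resolution^k candidate configurations for k hyper-parameters; with random search, budget configurations are drawn at random. The hyper-parameter names and ranges below are hypothetical.

```r
# Grid search: `resolution` equally spaced values per hyper-parameter.
resolution <- 5
grid <- expand.grid(
  alpha  = seq(0, 1, length.out = resolution),
  lambda = seq(0.01, 1, length.out = resolution)
)

# Random search: `budget` configurations drawn uniformly from the same ranges.
set.seed(42)
budget <- 30
random_trials <- data.frame(
  alpha  = runif(budget, min = 0, max = 1),
  lambda = runif(budget, min = 0.01, max = 1)
)
```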

n_evals

Positive integer. Number of evaluations before termination. Default: 30.

minute_time

Positive integer. Maximum run time, in minutes, before termination. Default: 10.

patience

Positive numeric. Fraction of stagnating evaluations tolerated before termination. Default: 0.3.

min_improve

Positive numeric. Minimum error improvement required to avoid termination. Default: 0.01.
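One way to read patience and min_improve together is as a stagnation rule: stop when the best error over the most recent window of evaluations has not improved on the earlier best by at least min_improve. The sketch below assumes the window length is patience times n_evals, which this page does not state; should_stop is a hypothetical helper.

```r
# Returns TRUE when the recent window shows no sufficient improvement.
# Window length = patience * n_evals is an assumption made for illustration.
should_stop <- function(errors, n_evals = 30, patience = 0.3, min_improve = 0.01) {
  window <- max(1, round(patience * n_evals))
  if (length(errors) <= window) return(FALSE)
  best_before <- min(head(errors, -window))
  best_recent <- min(tail(errors, window))
  (best_before - best_recent) < min_improve
}

stagnating <- c(1.0, 0.8, 0.7, rep(0.7, 9))
improving  <- seq(1, 0.1, length.out = 12)
```

Under these assumptions, should_stop(stagnating) signals termination while should_stop(improving) does not.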

java_mem

Positive integer. Memory allocated to Java. Default: 64.

decimals

Positive integer. Decimal format of prediction. Default: 2.

seed

Positive integer. Random seed for reproducibility. Default: 42.

Value

This function returns a list including:

Author(s)

Giancarlo Vercellino giancarlo.vercellino@gmail.com

See Also

Useful links:

Examples

## Not run: 
sense(benchmark, "y", algos = c("glmnet", "rpart"))


## End(Not run)


[Package sense version 1.1.0 Index]