makeCPOCase {mlrCPO}    R Documentation

Build Data-Dependent CPOs

Description

This is a CPOConstructor to be used to create a CPO. It is called like any R function and returns the created CPO.

A meta-CPO that determines which CPO to apply to the data, depending on a provided function. Many parameters coincide with the parameters of makeCPO; it is suggested to read the relevant parameter descriptions there.

makeCPOCase creates a CPOConstructor, while cpoCase can itself be used as a CPOConstructor.

Usage

makeCPOCase(
  par.set = makeParamSet(),
  par.vals = list(),
  export.cpos = list(),
  dataformat = c("df.features", "split", "df.all", "task", "factor", "ordered",
    "numeric"),
  dataformat.factor.with.ordered = TRUE,
  properties.data = NULL,
  properties.adding = NULL,
  properties.needed = NULL,
  properties.target = NULL,
  cpo.build
)

cpoCase(
  par.set = makeParamSet(),
  par.vals = list(),
  export.cpos = list(),
  dataformat = c("df.features", "split", "df.all", "task", "factor", "ordered",
    "numeric"),
  dataformat.factor.with.ordered = TRUE,
  properties.data = NULL,
  properties.adding = NULL,
  properties.needed = NULL,
  properties.target = NULL,
  cpo.build,
  id,
  export = "export.default",
  affect.type = NULL,
  affect.index = integer(0),
  affect.names = character(0),
  affect.pattern = NULL,
  affect.invert = FALSE,
  affect.pattern.ignore.case = FALSE,
  affect.pattern.perl = FALSE,
  affect.pattern.fixed = FALSE
)

Arguments

par.set

[ParamSet]
Parameters of the CPO, in addition to those of the exported CPOs. Default is the empty ParamSet.

par.vals

[list]
Named list of default parameter values for the CPO. These are used in addition to the parameter default values of par.set. It is often more elegant to use the default values of par.set rather than par.vals. Default is list().

export.cpos

[list of CPO]
List of CPO objects that have their hyperparameters exported. If this is a named list, the names must be unique and represent the parameter name by which they are given to the cpo.build function. They are also the IDs that will be given to the CPOs upon construction. If the list is not named, the IDs (or default names, in case of CPOConstructors) are used instead, and these need to be unique.

All CPOs in the list must either be all Feature Operation CPOs, all Target Operation CPOs performing the same conversion, or all Retrafoless CPOs.

The cpo.build function needs to have an argument for each of the names in the list. The CPO objects are pre-configured by the framework to have the hyperparameter settings as set by the ones exported by cpoCase. Default is list().

dataformat

[character(1)]
Indicate what format the data should be as seen by “cpo.build”. See the parameter in makeCPO for details.

Note that if the CPOs in export.cpos are Retrafoless CPOs, this must be either “task” or “df.all”. Default is “df.features”.

dataformat.factor.with.ordered

[logical(1)]
Whether to treat ordered typed features as factor typed features. See the parameter in makeCPO. Default is TRUE.

properties.data

[character]
See the parameter in makeCPO.

The properties of the resulting CPO are calculated from the constituent CPOs automatically in the most lenient way. If this parameter is not NULL, the given properties are used instead of the calculated properties.

Default is NULL.

properties.adding

[character]
See the parameter in makeCPO.

The properties of the resulting CPO are calculated from the constituent CPOs automatically in the most lenient way. If this parameter is not NULL, the given properties are used instead of the calculated properties.

Default is NULL.

properties.needed

[character]
See the parameter in makeCPO.

The properties of the resulting CPO are calculated from the constituent CPOs automatically in the most lenient way. If this parameter is not NULL, the given properties are used instead of the calculated properties.

Default is NULL.

properties.target

[character]
See the parameter in makeCPO.

The properties of the resulting CPO are calculated from the constituent CPOs automatically in the most lenient way. If this parameter is not NULL, the given properties are used instead of the calculated properties.

Default is NULL.

cpo.build

[function]
This function works similarly to cpo.trafo in makeCPO: it has the arguments data and target, and one argument for each hyperparameter declared in par.set. However, it also has one parameter for each entry in export.cpos, named after the corresponding item in that list. The cpoCase framework supplies the pre-configured CPOs (configured as the exported hyperparameters of cpoCase demand) to the cpo.build code via these parameters. The return value of cpo.build must be a CPO, which will then be used on the data.

Just as cpo.trafo in makeCPO, this can also be a ‘headless’ function; it then must be written as an expression, starting with a {.
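
As a minimal sketch of how export.cpos and cpo.build fit together, the following builds a case CPO that chooses the order of scaling and PCA from the incoming data. The hyperparameter name always.scale.first, the list names scale and pca, and the threshold of 10 are assumptions made for illustration only; they are not part of the package.

```r
library(mlrCPO)

# Illustrative case CPO: the order of scaling and PCA depends on the data.
# 'always.scale.first' is a hypothetical hyperparameter made up for this sketch.
scale.and.pca = cpoCase(
  par.set = makeParamSet(
    makeLogicalParam("always.scale.first", default = FALSE)),
  # Named list: the names become the cpo.build argument names and the CPO IDs.
  export.cpos = list(scale = cpoScale(), pca = cpoPca()),
  cpo.build = function(data, target, always.scale.first, scale, pca) {
    # With the default dataformat "df.features", 'data' holds the feature columns.
    if (always.scale.first || mean(data[[1]]) > 10) {
      scale %>>% pca
    } else {
      pca %>>% scale
    }
  })

# Applying the CPO to a task calls cpo.build with the task's data
# and then applies the CPO that cpo.build returned.
transformed = iris.task %>>% scale.and.pca
```

The hyperparameters of the two exported CPOs remain adjustable on scale.and.pca, so the choice made in cpo.build and the settings of the constituent CPOs can be tuned together.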

id

[character(1)]
ID to use as prefix for the CPO's hyperparameters. This must be used to avoid name clashes when composing two CPOs of the same type, or when composing with learners or other CPOs that have hyperparameters with clashing names.

export

[character]
Either a character vector indicating the parameters to export as hyperparameters, or one of the special values “export.all” (export all parameters), “export.default” (export all parameters that are exported by default), “export.set” (export all parameters that were set during construction), “export.default.set” (export the intersection of the “default” and “set” parameters), “export.unset” (export all parameters that were not set during construction) or “export.default.unset” (export the intersection of the “default” and “unset” parameters). Default is “export.default”.

affect.type

[character | NULL]
Type of columns to affect. A subset of “numeric”, “factor”, “ordered”, “other”, or NULL to not match by column type. Default is NULL.

affect.index

[numeric]
Indices of feature columns to affect. The order of indices given is respected. Target column indices are not counted (since target columns are always included). Default is integer(0).

affect.names

[character]
Feature names of feature columns to affect. The order of names given is respected. Default is character(0).

affect.pattern

[character(1) | NULL]
grep pattern to match feature names by. Default is NULL (no pattern matching).

affect.invert

[logical(1)]
Whether to affect all features not matched by other affect.* parameters. Default is FALSE.

affect.pattern.ignore.case

[logical(1)]
Ignore case when matching features with affect.pattern; see grep. Default is FALSE.

affect.pattern.perl

[logical(1)]
Use Perl-style regular expressions for affect.pattern; see grep. Default is FALSE.

affect.pattern.fixed

[logical(1)]
Use fixed matching instead of regular expressions for affect.pattern; see grep. Default is FALSE.

Value

[CPO].

General CPO info

This function creates a CPO object, which can be applied to Tasks, data.frames, Learners and other CPO objects using the %>>% operator.

The parameters of this object can be changed after creation using the function setHyperPars. The other hyperparameter-manipulating functions, getHyperPars and getParamSet, similarly work as one expects.

If the “id” parameter is given, the hyperparameters will have this id as a prefix; this will, however, not change the parameters of the creator function.
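
The effect of the id on hyperparameter names can be sketched with cpoScale, one of the CPOs listed under See Also. The id value "first" is an arbitrary choice for illustration.

```r
library(mlrCPO)

# The creator function keeps its own argument name ('center'), even with an id.
cpo = cpoScale(center = FALSE, id = "first")

# The hyperparameters of the resulting CPO carry the id as a prefix.
hp = getHyperPars(cpo)
names(hp)

# Changing a value after creation therefore uses the prefixed name.
cpo2 = setHyperPars(cpo, first.center = TRUE)
getHyperPars(cpo2)$first.center
```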

Calling a CPOConstructor

CPO constructor functions are called with optional values of parameters, and additional “special” optional values. The special optional values are the id parameter, and the affect.* parameters. The affect.* parameters enable the user to control which subset of a given dataset is affected. If no affect.* parameters are given, all data features are affected by default.
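
A short sketch of the affect.* parameters, again using cpoScale as a stand-in for any CPO: only the features whose names match the pattern are handed to the CPO, and the remaining columns pass through unchanged. The pattern "^Sepal" is chosen for the iris data and is only an example.

```r
library(mlrCPO)

# Scale only the features whose names start with "Sepal".
cpo = cpoScale(affect.pattern = "^Sepal")
scaled = iris %>>% cpo

# The affected column is centered, so its mean is numerically close to zero.
mean(scaled$Sepal.Length)

# A column not matched by the pattern is left as it was.
all.equal(scaled$Petal.Length, iris$Petal.Length)
```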

See Also

Other CPOs: cpoApplyFunRegrTarget(), cpoApplyFun(), cpoAsNumeric(), cpoCache(), cpoCbind(), cpoCollapseFact(), cpoDropConstants(), cpoDropMostlyConstants(), cpoDummyEncode(), cpoFilterAnova(), cpoFilterCarscore(), cpoFilterChiSquared(), cpoFilterFeatures(), cpoFilterGainRatio(), cpoFilterInformationGain(), cpoFilterKruskal(), cpoFilterLinearCorrelation(), cpoFilterMrmr(), cpoFilterOneR(), cpoFilterPermutationImportance(), cpoFilterRankCorrelation(), cpoFilterRelief(), cpoFilterRfCImportance(), cpoFilterRfImportance(), cpoFilterRfSRCImportance(), cpoFilterRfSRCMinDepth(), cpoFilterSymmetricalUncertainty(), cpoFilterUnivariate(), cpoFilterVariance(), cpoFixFactors(), cpoIca(), cpoImpactEncodeClassif(), cpoImpactEncodeRegr(), cpoImputeConstant(), cpoImputeHist(), cpoImputeLearner(), cpoImputeMax(), cpoImputeMean(), cpoImputeMedian(), cpoImputeMin(), cpoImputeMode(), cpoImputeNormal(), cpoImputeUniform(), cpoImpute(), cpoLogTrafoRegr(), cpoMakeCols(), cpoMissingIndicators(), cpoModelMatrix(), cpoOversample(), cpoPca(), cpoProbEncode(), cpoQuantileBinNumerics(), cpoRegrResiduals(), cpoResponseFromSE(), cpoSample(), cpoScaleMaxAbs(), cpoScaleRange(), cpoScale(), cpoSelect(), cpoSmote(), cpoSpatialSign(), cpoTransformParams(), cpoWrap(), makeCPOMultiplex()

Other special CPOs: cpoCbind(), cpoTransformParams(), cpoWrap(), makeCPOMultiplex()


[Package mlrCPO version 0.3.7-7 Index]