ecc {utiml}    R Documentation

Ensemble of Classifier Chains for multi-label Classification

Description

Create an Ensemble of Classifier Chains model for multi-label classification.

Usage

ecc(
  mdata,
  base.algorithm = getOption("utiml.base.algorithm", "SVM"),
  m = 10,
  subsample = 0.75,
  attr.space = 0.5,
  replacement = TRUE,
  ...,
  cores = getOption("utiml.cores", 1),
  seed = getOption("utiml.seed", NA)
)

Arguments

mdata

An mldr dataset used to train the binary models.

base.algorithm

A string with the name of the base algorithm. (Default: getOption("utiml.base.algorithm", "SVM"))

m

The number of Classifier Chains models used in the ensemble. (Default: 10)

subsample

A value between 0.1 and 1 that sets the proportion of training instances used for each classifier. (Default: 0.75)

attr.space

A value between 0.1 and 1 that sets the proportion of attributes used for each classifier. (Default: 0.50)

replacement

Boolean value that defines whether sampling with replacement is used to create the data for each model of the ensemble. (Default: TRUE)

...

Other arguments passed to the base algorithm for all subproblems.

cores

The number of cores used to parallelize the training. Values higher than 1 require the parallel package. (Default: getOption("utiml.cores", 1))

seed

An optional integer used to set the seed. This is useful when the method is run in parallel. (Default: getOption("utiml.seed", NA))
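To illustrate how m, subsample, attr.space and replacement interact, the following base R sketch draws the row and column indices that each ensemble member could be trained on. It is not the package's internal code: the helper name draw_ensemble_indices, the use of ceiling() for rounding, and drawing attributes without replacement are all assumptions made for illustration.

```r
# Hypothetical helper (not part of utiml): draw the instances and attributes
# that each of the m ensemble members would be trained on.
draw_ensemble_indices <- function(n.instances, n.attributes, m = 10,
                                  subsample = 0.75, attr.space = 0.5,
                                  replacement = TRUE, seed = NA) {
  if (!is.na(seed)) set.seed(seed)
  n.rows <- ceiling(n.instances * subsample)    # assumed rounding
  n.cols <- ceiling(n.attributes * attr.space)  # assumed rounding
  lapply(seq_len(m), function(i) {
    list(
      # instances are drawn with or without replacement
      rows = sample(n.instances, n.rows, replace = replacement),
      # attributes are drawn without replacement (assumption)
      cols = sample(n.attributes, n.cols)
    )
  })
}

idx <- draw_ensemble_indices(n.instances = 100, n.attributes = 10, seed = 123)
length(idx)            # one entry per ensemble member
length(idx[[1]]$rows)  # instances per member
length(idx[[1]]$cols)  # attributes per member
```

With the default values, a dataset of 100 instances and 10 attributes gives each of the 10 members 75 sampled instances and 5 sampled attributes.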

Details

This model is composed of a set of Classifier Chains models. Classifier Chains is a transformation method based on Binary Relevance (BR) for predicting multi-label data. It differs from the BR method in that it extends the attribute space with the 0/1 label relevances of all previous classifiers, forming a classifier chain.
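The chaining idea can be sketched in base R. This is a minimal illustration under stated assumptions, not utiml's implementation: it uses glm() logistic regression as the base learner, a made-up data set, and the hypothetical helper names train_chain and predict_chain.

```r
set.seed(1)
# Made-up data: 2 attributes and 3 binary labels
n <- 200
x <- data.frame(a1 = rnorm(n), a2 = rnorm(n))
y <- data.frame(
  y1 = as.integer(x$a1 + rnorm(n) > 0),
  y2 = as.integer(x$a2 + rnorm(n) > 0)
)
y$y3 <- as.integer(y$y1 + y$y2 + rnorm(n, sd = 0.5) > 1)

# Train one binary model per label; each model sees the original attributes
# plus the 0/1 values of all labels earlier in the chain.
train_chain <- function(x, y, chain = names(y)) {
  models <- list()
  feats <- x
  for (label in chain) {
    data <- cbind(feats, target = y[[label]])
    models[[label]] <- glm(target ~ ., data = data, family = binomial())
    feats[[label]] <- y[[label]]  # extend the attribute space with the true label
  }
  list(chain = chain, models = models)
}

# Predict in the same order, feeding each 0/1 prediction forward.
predict_chain <- function(fit, x) {
  feats <- x
  for (label in fit$chain) {
    prob <- predict(fit$models[[label]], newdata = feats, type = "response")
    feats[[label]] <- as.integer(prob >= 0.5)
  }
  feats[fit$chain]
}

fit <- train_chain(x, y)
pred <- predict_chain(fit, x)
head(pred)
```

In this sketch the model for y3 is fitted on a1, a2, y1 and y2, whereas a plain BR model for y3 would see only a1 and a2.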

Value

An object of class ECCmodel containing the set of fitted CC models, including:

rounds

The number of iterations.

models

A list of CC models.

nrow

The number of instances used in each training dataset.

ncol

The number of attributes used in each training dataset.

Note

If you want to reproduce the same classification and obtain the same result, it is necessary to set the flag utiml.mc.set.seed to FALSE.
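A minimal sketch of setting this flag, assuming it is read as a global R option in the same way as utiml.cores and utiml.seed (this page does not state how the flag is set, so treat that as an assumption). The snippet uses only base R and restores the previous values afterwards.

```r
# Assumption: utiml.mc.set.seed is a global R option like utiml.seed.
# options() returns the previous values so they can be restored later.
old <- options(utiml.mc.set.seed = FALSE, utiml.seed = 123, utiml.cores = 1)

getOption("utiml.mc.set.seed")  # value now in effect
getOption("utiml.seed", NA)     # same lookup as the default of the seed argument

# ... train and predict here ...

# Restore the previous values when done
options(old)
```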

References

Read, J., Pfahringer, B., Holmes, G., & Frank, E. (2011). Classifier chains for multi-label classification. Machine Learning, 85(3), 333-359.

Read, J., Pfahringer, B., Holmes, G., & Frank, E. (2009). Classifier Chains for Multi-label Classification. Machine Learning and Knowledge Discovery in Databases, Lecture Notes in Computer Science, 5782, 254-269.

See Also

Other Transformation methods: brplus(), br(), cc(), clr(), dbr(), ebr(), eps(), esl(), homer(), lift(), lp(), mbr(), ns(), ppt(), prudent(), ps(), rakel(), rdbr(), rpc()

Other Ensemble methods: ebr(), eps()

Examples

# Use the RANDOM base algorithm with all other default values
model <- ecc(toyml, "RANDOM")
pred <- predict(model, toyml)


# Use C5.0 with 100% of instances and only 5 rounds
model <- ecc(toyml, 'C5.0', m = 5, subsample = 1)

# Use 75% of attributes
model <- ecc(toyml, attr.space = 0.75)

# Run on 2 cores and define a specific seed
model1 <- ecc(toyml, cores = 2, seed = 123)


[Package utiml version 0.1.7 Index]