topPerformers {performanceEstimation}    R Documentation

Obtain the best scores from a performance estimation experiment

Description

This function returns the names of the workflows that obtained the best scores (the top performers) in an experimental comparison. The information is reported for each evaluation metric involved in the comparison and for each predictive task that was used.

Usage

topPerformers(compRes,
           maxs=rep(FALSE,dim(compRes[[1]][[1]]@iterationsScores)[2]),
           stat="avg",digs=3)

Arguments

compRes

A ComparisonResults object with the results of your experimental comparison.

maxs

A vector of booleans with as many elements as there are metrics estimated in the experimental comparison. A TRUE value means the respective metric is to be maximized, while a FALSE value means it is to be minimized. Defaults to all FALSE values, i.e. all metrics are to be minimized.

stat

The statistic to be used to obtain the ranks. The options are the statistics produced by the function summary applied to objects of class ComparisonResults, i.e. "avg", "std", "med", "iqr", "min", "max" or "invalid" (defaults to "avg").

digs

The number of digits (defaults to 3) used in the scores column of the results.

Details

This is a utility function to check which workflows were the top performers in a comparative experiment, for each data set and each evaluation metric. The notion of best performance depends on the type of evaluation metric, hence the need for the second argument. Some evaluation metrics are to be maximized (e.g. accuracy), while others are to be minimized (e.g. mean squared error). If you have a mix of these types in your experiment, you can use the maxs parameter to tell the function which metrics are to be maximized and which are to be minimized.
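The role of maxs can be illustrated with a minimal base R sketch. It is not the package's implementation: the score matrix, the workflow names, the helper pickTop() and the column names of its result are all hypothetical, chosen only to show how one flag per metric switches between picking the largest and the smallest score.

```r
## Hypothetical average scores for one task: rows are workflows,
## columns are metrics (all values are made up for illustration).
avgScores <- matrix(c(0.81, 0.86, 0.84,
                      12.4, 10.9, 11.7),
                    nrow = 3,
                    dimnames = list(c("svm.v1", "svm.v2", "svm.v3"),
                                    c("acc", "mse")))

## One flag per metric: maximize accuracy, minimize mean squared error.
maxs <- c(TRUE, FALSE)

## pickTop() is an illustrative helper, not a function of the package.
pickTop <- function(scores, maxs, digs = 3) {
  stopifnot(length(maxs) == ncol(scores))
  cols <- seq_len(ncol(scores))
  ## row index of the best workflow for each metric
  best <- vapply(cols, function(j) {
    if (maxs[j]) which.max(scores[, j]) else which.min(scores[, j])
  }, integer(1))
  data.frame(Workflow = rownames(scores)[best],
             Estimate = round(scores[cbind(best, cols)], digs),
             row.names = colnames(scores),
             stringsAsFactors = FALSE)
}

top <- pickTop(avgScores, maxs)
print(top)
```

With all flags left at FALSE, the same helper would select the workflow with the lowest accuracy, which is why a mix of metric types calls for an explicit maxs vector.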

Value

The function returns a list with named components, one for each predictive task used in the experimental comparison. Each component is a data.frame whose rows correspond to the evaluation metrics. For each metric you get the name of the top performer (1st column of the data frame) and the respective score on that metric (2nd column).
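The following sketch shows one way such a result could be navigated. The object res is built by hand as a stand-in for the returned list; the task names, workflow names, column names and scores are all hypothetical, and only the overall shape (a named list of two-column data frames) is taken from the description above.

```r
## Hand-built stand-in for the returned list (every name and value
## here is hypothetical).
res <- list(
  taskA = data.frame(Workflow = c("wf1", "wf2"),
                     Estimate = c(10.9, 2.61),
                     row.names = c("mse", "mae"),
                     stringsAsFactors = FALSE),
  taskB = data.frame(Workflow = c("wf2", "wf2"),
                     Estimate = c(7.35, 2.05),
                     row.names = c("mse", "mae"),
                     stringsAsFactors = FALSE)
)

## The components are named after the predictive tasks.
names(res)

## Top performer (1st column) and its score (2nd column) for one
## task and one metric, using positional column indexing.
res[["taskA"]]["mse", 1]
res[["taskA"]]["mse", 2]

## Collect the winners of all tasks in a metrics-by-tasks matrix.
winners <- sapply(res, function(d) setNames(d[[1]], rownames(d)))
print(winners)
```

Indexing the columns by position rather than by name follows the wording of the description and avoids relying on particular column names.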

Author(s)

Luis Torgo ltorgo@dcc.fc.up.pt

References

Torgo, L. (2014) An Infra-Structure for Performance Estimation and Experimental Comparison of Predictive Models in R. arXiv:1412.0436 [cs.MS] http://arxiv.org/abs/1412.0436

See Also

performanceEstimation, topPerformer, rankWorkflows, metricsSummary

Examples

## Not run: 
## Estimating several evaluation metrics on different variants of an
## SVM, on two data sets, using two repetitions of 5-fold CV

library(performanceEstimation)
library(e1071)   # provides svm()

data(swiss)
data(mtcars)

## run the experimental comparison
results <- performanceEstimation(
               c(PredTask(Infant.Mortality ~ ., swiss),
                 PredTask(mpg ~ ., mtcars)),
               c(workflowVariants(learner='svm',
                                  learner.pars=list(cost=c(1,5),gamma=c(0.1,0.01)))),
               EstimationTask(metrics=c("mse","mae"),method=CV(nReps=2,nFolds=5))
           )
## get the top performers for each task and evaluation metric
## (both mse and mae are to be minimized, so the default maxs applies)
topPerformers(results)

## End(Not run)

[Package performanceEstimation version 1.1.0 Index]