CDdiagram.BD {performanceEstimation}    R Documentation

CD diagrams for the post-hoc Bonferroni-Dunn test

Description

This function obtains a Critical Difference (CD) diagram for the post-hoc Bonferroni-Dunn test along the lines defined by Demsar (2006). These diagrams visualize the statistical significance of the observed paired differences between a set of workflows and a selected baseline workflow. They allow us to compare a set of alternative workflows against this baseline and answer the question of whether the differences are statistically significant.

Usage

CDdiagram.BD(r, metric = names(r)[1])

Arguments

r

A list resulting from a call to pairedComparisons

metric

The metric for which the CD diagram will be obtained (defaults to the first metric of the comparison).

Details

Critical Difference (CD) diagrams are succinct visualizations of the results of a Bonferroni-Dunn post-hoc test, which is designed to check the statistical significance of the differences in average rank between a set of workflows and a baseline workflow, on a set of predictive tasks.

In the resulting graph each workflow is represented by a colored line. The position on the X axis where each line ends represents the average rank of the respective workflow across all tasks. The null hypothesis is that the average rank of the baseline workflow does not differ with statistical significance (at the confidence level defined in the call to pairedComparisons that creates the object used to obtain these graphs) from the average ranks of the alternative workflows. A horizontal line connects the baseline workflow with the alternative workflows for which we cannot reject this hypothesis. This means that only the alternative workflows that are not connected with the baseline can be considered as having an average rank that differs from that of the baseline with statistical significance. To help spot these differences, the name of the baseline workflow is shown in bold, and the names of the alternative workflows whose difference is significant are shown in italics.
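
As a rough illustration of the decision rule behind the diagram, the following base R sketch computes the Bonferroni-Dunn critical difference in the form given by Demsar (2006), CD = q_alpha * sqrt(k(k+1)/(6N)), for k workflows and N tasks. The helper bd_critical_difference and the average ranks are hypothetical and made up for this example; this is not necessarily how the package computes the value internally.

```r
# Hypothetical helper (not part of performanceEstimation): critical
# difference for the Bonferroni-Dunn test as described by Demsar (2006).
bd_critical_difference <- function(k, n_tasks, alpha = 0.05) {
  # Two-sided normal quantile with a Bonferroni correction over the
  # k - 1 comparisons against the baseline
  q_alpha <- qnorm(1 - alpha / (2 * (k - 1)))
  q_alpha * sqrt(k * (k + 1) / (6 * n_tasks))
}

# Made-up average ranks of a baseline and three alternatives over 10 tasks
avg_ranks <- c(base = 1.6, alt1 = 2.3, alt2 = 2.7, alt3 = 3.4)
cd <- bd_critical_difference(k = length(avg_ranks), n_tasks = 10)

# An alternative differs significantly from the baseline when its average
# rank is more than cd away from the baseline's average rank
signif_diff <- abs(avg_ranks[-1] - avg_ranks["base"]) > cd
print(cd)
print(signif_diff)
```

With these illustrative ranks only alt3 lies further than the critical difference from the baseline, so in a CD diagram it would be the one alternative left unconnected to the baseline.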

Value

Nothing; the graph is drawn on the current device.

Author(s)

Luis Torgo ltorgo@dcc.fc.up.pt

References

Demsar, J. (2006) Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research, 7, 1-30.

Torgo, L. (2014) An Infra-Structure for Performance Estimation and Experimental Comparison of Predictive Models in R. arXiv:1412.0436 [cs.MS] http://arxiv.org/abs/1412.0436

See Also

CDdiagram.Nemenyi, CDdiagram.BD, signifDiffs, metricNames, performanceEstimation, topPerformers, topPerformer, rankWorkflows, metricsSummary, ComparisonResults

Examples

## Not run: 
## Estimating error rate and accuracy for 8 variants of SVMs
## (4 cost values x 2 gamma values), on three data sets, using one
## repetition of 10-fold CV
library(performanceEstimation)
library(e1071)
data(iris)
data(Satellite,package="mlbench")
data(LetterRecognition,package="mlbench")


## running the estimation experiment
res <- performanceEstimation(
           c(PredTask(Species ~ .,iris),
             PredTask(classes ~ .,Satellite,"sat"),
             PredTask(lettr ~ .,LetterRecognition,"letter")),
           workflowVariants(learner="svm",
                 learner.pars=list(cost=1:4,gamma=c(0.1,0.01))),
           EstimationTask(metrics=c("err","acc"),method=CV()))


## checking the top performers
topPerformers(res)

## now let us assume that we will choose "svm.v2" as our baseline
## carry out the paired comparisons
pres <- pairedComparisons(res,"svm.v2")

## obtaining a CD diagram comparing all workflows against
## the baseline (defined in the previous call to pairedComparisons)
CDdiagram.BD(pres,metric="err")


## End(Not run)

[Package performanceEstimation version 1.1.0 Index]