calculateCrossValidation {rSRD}    R Documentation

calculateCrossValidation

Description

R interface for testing whether the rankings induced by the columns come from the same distribution. If the number of folds and the test method are not specified, the default is the Wilcoxon test combined with 8-fold cross-validation. If the number of rows is less than 8, leave-one-out cross-validation is applied. Columns are ordered by the SRD values of the different folds, then each pair of consecutive columns is tested.

The test statistic for the Alpaydin test follows an F distribution with df1 = 2k and df2 = k degrees of freedom. The Dietterich test statistic follows a t-distribution with k degrees of freedom (two-tailed). The Wilcoxon test statistic is calculated as the absolute value of the difference between the sum of the positive ranks (W+) and the sum of the negative ranks (W-). The distribution of this test statistic can be derived from the Wilcoxon signed rank distribution. For more information about the cross-validation process, see Sziklai, Baranyi and Héberger (2021).
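
As an illustration of the Wilcoxon statistic described above, the following base R sketch computes |W+ - W-| for two made-up vectors of per-fold values and derives an exact tail probability from the signed rank distribution. The helper name wilcoxon_abs_diff, the input vectors, and the treatment of zeros and ties (zeros dropped, average ranks) are assumptions for illustration only; the internal implementation in rSRD may differ.

```r
# Hypothetical helper (not part of rSRD): |W+ - W-| for paired values.
wilcoxon_abs_diff <- function(x, y) {
  d <- x - y
  d <- d[d != 0]            # assumption: zero differences are discarded
  r <- rank(abs(d))         # ranks of absolute differences (ties get average ranks)
  w_plus  <- sum(r[d > 0])  # sum of the ranks of positive differences
  w_minus <- sum(r[d < 0])  # sum of the ranks of negative differences
  abs(w_plus - w_minus)
}

# Made-up per-fold values for two columns (integers, no ties, no zeros).
srd_a <- c(12, 25, 30, 15, 40)
srd_b <- c(20, 20, 45, 35, 30)

stat <- wilcoxon_abs_diff(srd_a, srd_b)

# Because W+ + W- = n(n+1)/2, the statistic equals |2 * W+ - n(n+1)/2|,
# so its null distribution follows from the signed rank distribution of W+.
n <- length(srd_a)                 # number of nonzero differences in this example
v <- 0:(n * (n + 1) / 2)           # all possible values of W+
p_value <- sum(dsignrank(v, n)[abs(2 * v - n * (n + 1) / 2) >= stat])

stat     # 7
p_value  # 0.4375
```

For these inputs the result agrees with the exact two-sided p-value reported by wilcox.test(srd_a, srd_b, paired = TRUE) in base R.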

Usage

calculateCrossValidation(
  data_matrix,
  method = "Wilcoxon",
  number_of_folds = 8,
  precision = 5,
  output_to_file = TRUE
)

Arguments

data_matrix

A data frame.

method

A string specifying the method. The methods "Wilcoxon", "Alpaydin" and "Dietterich" are available.

number_of_folds

The number of folds used in the cross-validation. Must be between 5 and 10.

precision

The precision used for the ranking matrix transformation.

output_to_file

Boolean flag to enable file output.
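
For the "Alpaydin" and "Dietterich" options of method, the Description names an F distribution with df1 = 2k, df2 = k and a two-tailed t-distribution with k degrees of freedom. The sketch below shows how tail probabilities under those reference distributions can be evaluated with the base R functions pf and pt. The value of k and both statistics are made up, this page does not define k, and the use of the upper tail for the F statistic is an assumption.

```r
k <- 5          # assumption: illustrative value, not taken from rSRD
t_stat <- 2.5   # made-up Dietterich-type statistic
f_stat <- 4.0   # made-up Alpaydin-type statistic

# Two-tailed probability under a t-distribution with k degrees of freedom.
p_dietterich <- 2 * pt(abs(t_stat), df = k, lower.tail = FALSE)

# Upper-tail probability under an F distribution with df1 = 2k, df2 = k
# (assumption: large values of the statistic count as evidence of a difference).
p_alpaydin <- pf(f_stat, df1 = 2 * k, df2 = k, lower.tail = FALSE)

p_dietterich
p_alpaydin
```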

Value

A List containing

Author(s)

Balázs R. Sziklai sziklai.balazs@krtk.hu, Linus Olsson linusmeol@gmail.com, Jochen Staudacher jochen.staudacher@hs-kempten.de

References

Sziklai, Balázs R., Máté Baranyi, and Károly Héberger (2021). "Testing Cross-Validation Variants in Ranking Environments", arXiv preprint arXiv:2105.11939.

Examples

# Three candidate solutions and a reference column, seven rows each.
df <- data.frame(
  Sol_1 = c(7, 6, 5, 4, 3, 2, 1),
  Sol_2 = c(1, 2, 3, 4, 5, 7, 6),
  Sol_3 = c(1, 2, 3, 4, 7, 5, 6),
  Ref   = c(1, 2, 3, 4, 5, 6, 7)
)

calculateCrossValidation(df, output_to_file = FALSE)
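
To give a feel for the SRD values by which columns are ordered, the base R sketch below computes raw sums of absolute rank differences against the reference for the example data. It assumes that the last column is the reference and uses the plain, unnormalized SRD definition on the full data set. How rSRD computes or normalizes these values within each fold is not described on this page, so treat this as an illustration and not as the package's procedure.

```r
df <- data.frame(
  Sol_1 = c(7, 6, 5, 4, 3, 2, 1),
  Sol_2 = c(1, 2, 3, 4, 5, 7, 6),
  Sol_3 = c(1, 2, 3, 4, 7, 5, 6),
  Ref   = c(1, 2, 3, 4, 5, 6, 7)
)

# Rank every column, then compare each solution with the reference ranks.
ranks <- as.data.frame(lapply(df, rank))
ref <- ranks[[ncol(ranks)]]   # assumption: the last column is the reference

# Raw SRD: sum of absolute rank differences to the reference.
srd <- sapply(ranks[-ncol(ranks)], function(col) sum(abs(col - ref)))

srd
# Sol_1 Sol_2 Sol_3
#    24     2     4

# Columns ordered from closest to farthest from the reference.
names(sort(srd))
# "Sol_2" "Sol_3" "Sol_1"
```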

[Package rSRD version 0.1.7 Index]