compare_features {theftdlc}

R Documentation

Conduct statistical testing on time-series feature classification performance to identify top features or compare entire sets

Description

Conduct statistical testing on time-series feature classification performance to identify top features or compare entire sets

Usage

compare_features(
  data,
  metric = c("accuracy", "precision", "recall", "f1"),
  by_set = TRUE,
  hypothesis = c("null", "pairwise"),
  p_adj = c("none", "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr")
)

Arguments

`data`	`list` object containing the classification outputs produced by `tsfeature_classifier`
`metric`	`character` denoting the classification performance metric to use in statistical testing. Can be one of `"accuracy"`, `"precision"`, `"recall"`, `"f1"`. Defaults to `"accuracy"`
`by_set`	`Boolean` specifying whether to compare feature sets (if `TRUE`) or individual features (if `FALSE`). Defaults to `TRUE`, but the value should match whether the classification in `tsfeature_classifier` was computed by set
`hypothesis`	`character` denoting the type of hypothesis test to conduct. `"null"` calculates p-values for each feature set or feature (depending on the `by_set` argument) individually relative to the null, which requires `use_null = TRUE` in `tsfeature_classifier`. `"pairwise"` conducts pairwise comparisons between each set or feature on main model fits only. Defaults to `"null"`
`p_adj`	`character` denoting the adjustment made to p-values for multiple comparisons. Should be a valid `method` argument to `stats::p.adjust`. Defaults to `"none"` for no adjustment. `"holm"` is recommended as a starting point for adjustments
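
The effect of the `p_adj` options can be previewed with `stats::p.adjust` on its own. The following is a minimal sketch, assuming four made-up raw p-values (the vector `p_raw` is hypothetical and is not output from `compare_features`); it compares the recommended `"holm"` adjustment against the more conservative `"bonferroni"`:

```r
# Hypothetical raw p-values, e.g. from four pairwise comparisons
# (illustrative numbers only, not produced by compare_features)
p_raw <- c(0.01, 0.02, 0.03, 0.04)

# Holm step-down adjustment
p_holm <- stats::p.adjust(p_raw, method = "holm")

# Bonferroni adjustment: each p-value multiplied by the number of tests
p_bonf <- stats::p.adjust(p_raw, method = "bonferroni")

# Side-by-side view of raw and adjusted p-values
print(data.frame(p_raw = p_raw, p_holm = p_holm, p_bonf = p_bonf))
```

Holm-adjusted p-values are never larger than the Bonferroni-adjusted ones, which is one reason it is a reasonable first choice when many sets or features are compared.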

Value

`data.frame` containing the results of the statistical tests

Author(s)

Trent Henderson

References

Henderson, T., Bryant, A. G., and Fulcher, B. D. Never a Dull Moment: Distributional Properties as a Baseline for Time-Series Classification. 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining (2023).

Examples


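library(theft)
library(theftdlc) # provides classify() and compare_features()

# Compute two simple features (mean and SD) on the simulated dataset
features <- theft::calculate_features(theft::simData,
  group_var = "process",
  feature_set = NULL,
  features = list("mean" = mean, "sd" = sd))

# Fit classifiers on each individual feature using 3 resamples
classifiers <- classify(features,
                        by_set = FALSE,
                        n_resamples = 3)

# Run pairwise comparisons between the individual features
compare_features(classifiers,
                 by_set = FALSE,
                 hypothesis = "pairwise")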

[Package theftdlc version 0.1.0 Index]