classify {theftdlc}

R Documentation

Fit classifiers on time-series features using a resample-based approach and get a fast understanding of performance

Description

Fits a classifier on time-series features over repeated train-test resamples and returns the classification performance results, giving a fast understanding of performance.

Usage

classify(
  data,
  classifier = NULL,
  train_size = 0.75,
  n_resamples = 30,
  by_set = TRUE,
  use_null = FALSE,
  seed = 123
)

tsfeature_classifier(
  data,
  classifier = NULL,
  train_size = 0.75,
  n_resamples = 30,
  by_set = TRUE,
  use_null = FALSE,
  seed = 123
)

Arguments

`data`	`feature_calculations` object containing the raw feature matrix produced by `theft::calculate_features`
`classifier`	`function` specifying the classifier to fit. Should be a function with 2 arguments, `formula` and `data`, that returns a fitted model compatible with R's `predict` functionality. Note that `classify` z-scores the data prior to modelling using the train set's information, so disabling any default scaling in your function is recommended. Defaults to `NULL`, in which case the following linear SVM is fit: `classifier = function(formula, data){mod <- e1071::svm(formula, data = data, kernel = "linear", scale = FALSE, probability = TRUE)}`
`train_size`	`numeric` denoting the proportion of samples to use in the training set. Defaults to `0.75`
`n_resamples`	`integer` denoting the number of resamples to calculate. Defaults to `30`
`by_set`	`Boolean` specifying whether to compute classifiers for each feature set. Defaults to `TRUE`. If `FALSE`, the function will instead find the best individually-performing features
`use_null`	`Boolean` specifying whether to fit null models in which class labels are shuffled, generating a null distribution that can be compared to performance on the correct class labels. Defaults to `FALSE`
`seed`	`integer` to fix R's random number generator to ensure reproducibility. Defaults to `123`
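The interplay of `train_size`, `seed`, the train-set z-scoring mentioned under `classifier`, and the label shuffling behind `use_null` can be illustrated with a minimal base R sketch. This is a conceptual illustration under stated assumptions, not the package's internal code: the toy data, the column names (`group`, `f1`, `f2`, `f3`), and the use of `floor()` for the split size are all made up for the example.

```r
# Conceptual sketch of ONE resample; not theftdlc's actual implementation.
set.seed(123)  # plays the role of the `seed` argument

# Hypothetical toy feature matrix: 40 samples, 3 features, 2 classes
n <- 40
toy <- data.frame(
  group = factor(rep(c("A", "B"), each = n / 2)),
  f1 = rnorm(n),
  f2 = rnorm(n, mean = 2, sd = 3),
  f3 = runif(n)
)

# Split into train and test sets (assumed rounding: floor)
train_size <- 0.75
train_idx <- sample(seq_len(n), size = floor(train_size * n))
train <- toy[train_idx, ]
test <- toy[-train_idx, ]

# z-score both sets using the TRAIN set's mean and standard deviation only,
# so no information from the test set leaks into the scaling
feature_cols <- setdiff(names(toy), "group")
mu <- vapply(train[feature_cols], mean, numeric(1))
sigma <- vapply(train[feature_cols], sd, numeric(1))
train[feature_cols] <- as.data.frame(scale(train[feature_cols], center = mu, scale = sigma))
test[feature_cols] <- as.data.frame(scale(test[feature_cols], center = mu, scale = sigma))

# Null model input: same features, class labels shuffled
train_null <- train
train_null$group <- sample(train_null$group)
```

Because the scaling is already done with train-set statistics, a classifier that rescales internally would scale the data twice, which is why the default SVM sets `scale = FALSE`.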

Value

`list` containing a named vector of train-test set sizes and a `data.frame` of classification performance results

Author(s)

Trent Henderson

Examples


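library(theft)
library(theftdlc)  # provides classify()

# Calculate catch22 features for the simulated data shipped with theft
features <- theft::calculate_features(theft::simData,
  group_var = "process",
  feature_set = "catch22")

# Fit classifiers for individual features over 3 resamples
classifiers <- classify(features,
  by_set = FALSE,
  n_resamples = 3)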
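As a further illustration, the sketch below passes a custom classifier through the `classifier` argument and enables null models. It assumes the `theft`, `theftdlc`, and `e1071` packages (plus the backend for the catch22 feature set) are installed. The function name `rbf_svm` is hypothetical, and the radial kernel is simply one alternative to the default linear SVM, not a recommendation from the package.

```r
library(theft)
library(theftdlc)

# Hypothetical custom classifier: takes `formula` and `data`, returns a model
# that works with predict(). scale = FALSE because classify() already
# z-scores the data using the train set's information.
rbf_svm <- function(formula, data) {
  e1071::svm(formula, data = data, kernel = "radial",
             scale = FALSE, probability = TRUE)
}

features <- theft::calculate_features(theft::simData,
  group_var = "process",
  feature_set = "catch22")

# One classifier per feature set, with shuffled-label null models
results <- classify(features,
  classifier = rbf_svm,
  by_set = TRUE,
  n_resamples = 3,
  use_null = TRUE,
  seed = 123)

# Inspect the top-level structure of the returned list
str(results, max.level = 1)
```

Using `str()` avoids relying on specific element names in the returned list, which this page does not document.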

[Package theftdlc version 0.1.0 Index]