classify {theftdlc}    R Documentation

Fit classifiers on time-series features using a resample-based approach and get a quick overview of performance

Description

Fit classifiers on time-series features using a resample-based approach and get a quick overview of performance

Usage

classify(
  data,
  classifier = NULL,
  train_size = 0.75,
  n_resamples = 30,
  by_set = TRUE,
  use_null = FALSE,
  seed = 123
)

tsfeature_classifier(
  data,
  classifier = NULL,
  train_size = 0.75,
  n_resamples = 30,
  by_set = TRUE,
  use_null = FALSE,
  seed = 123
)

Arguments

data

feature_calculations object containing the raw feature matrix produced by theft::calculate_features

classifier

function specifying the classifier to fit. Should be a function with two arguments, formula and data, that returns a fitted model compatible with R's predict functionality. Note that classify z-scores the data prior to modelling using the train set's information, so if your function scales its inputs by default, disabling that scaling is recommended. Defaults to NULL, in which case the following linear SVM is fit: classifier = function(formula, data){mod <- e1071::svm(formula, data = data, kernel = "linear", scale = FALSE, probability = TRUE)}
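As an illustration of the two-argument interface described above, a custom classifier might be sketched as follows. The name rbf_classifier and the choice of a radial kernel are assumptions made for this example only; the e1071::svm call mirrors the documented default except for the kernel. The check on iris is a stand-in to show that the returned model works with predict.

```r
# Hypothetical custom classifier: a radial-kernel SVM.
# scale = FALSE because classify is documented to z-score the data itself.
rbf_classifier <- function(formula, data) {
  e1071::svm(formula, data = data, kernel = "radial",
             scale = FALSE, probability = TRUE)
}

# Standalone check on a built-in data set (runs only if e1071 is installed)
if (requireNamespace("e1071", quietly = TRUE)) {
  fit <- rbf_classifier(Species ~ ., data = iris)
  preds <- predict(fit, newdata = iris)
  print(table(predicted = preds, actual = iris$Species))
}
```

A function like this could then be supplied as classify(features, classifier = rbf_classifier), where features is a feature_calculations object.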

train_size

numeric denoting the proportion of samples to use in the training set. Defaults to 0.75

n_resamples

integer denoting the number of resamples to calculate. Defaults to 30
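To make the roles of train_size, n_resamples and seed concrete, the following base R sketch draws repeated train-test splits and z-scores each split using the training rows only. It is not the package's internal code: the simple unstratified sampling, the use of iris as a stand-in feature matrix, and all object names are assumptions for illustration, and the package's own resampling scheme may differ.

```r
set.seed(123)        # plays the role of the seed argument
train_size  <- 0.75
n_resamples <- 3

# Stand-in feature matrix: iris is used purely for illustration
X <- as.matrix(iris[, 1:4])
n <- nrow(X)

resamples <- lapply(seq_len(n_resamples), function(i) {
  # Simple random split (an assumption; not necessarily stratified by class)
  train_idx <- sample(n, size = floor(train_size * n))
  test_idx  <- setdiff(seq_len(n), train_idx)

  # Mean and standard deviation are estimated from the training rows only
  mu    <- colMeans(X[train_idx, , drop = FALSE])
  sigma <- apply(X[train_idx, , drop = FALSE], 2, sd)

  # The same training-set statistics are applied to both sets
  list(
    train = scale(X[train_idx, , drop = FALSE], center = mu, scale = sigma),
    test  = scale(X[test_idx, , drop = FALSE], center = mu, scale = sigma)
  )
})

# Sizes of the train and test sets in the first resample
c(train = nrow(resamples[[1]]$train), test = nrow(resamples[[1]]$test))
```

Scaling the test set with training-set statistics keeps information from the held-out rows out of the fitted model, which is why a classifier's own default scaling is best switched off.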

by_set

Boolean specifying whether to compute classifiers for each feature set. Defaults to TRUE. If FALSE, the function will instead find the best individually-performing features

use_null

Boolean specifying whether to fit null models in which class labels are shuffled, generating a null distribution against which performance on the correct class labels can be compared. Defaults to FALSE
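The idea behind a shuffled-label null distribution can be sketched in base R as below. This is an illustration of the general technique, not the package's implementation: the nearest-centroid classifier, the helper centroid_accuracy, the iris stand-in data and the 100 shuffles are all assumptions chosen to keep the example self-contained.

```r
set.seed(123)

# Stand-in data: iris is used purely for illustration
X <- as.matrix(iris[, 1:4])
y <- iris$Species

# Hypothetical helper: test-set accuracy of a nearest-centroid classifier
centroid_accuracy <- function(X, y, train_idx) {
  test_idx <- setdiff(seq_len(nrow(X)), train_idx)
  # Class centroids from the training rows (features x classes)
  centroids <- sapply(levels(y), function(k) {
    colMeans(X[train_idx[y[train_idx] == k], , drop = FALSE])
  })
  # Squared Euclidean distance of each test row to each centroid
  dists <- apply(centroids, 2, function(cen) {
    rowSums(sweep(X[test_idx, , drop = FALSE], 2, cen)^2)
  })
  pred <- levels(y)[apply(dists, 1, which.min)]
  mean(pred == as.character(y[test_idx]))
}

train_idx <- sample(nrow(X), size = floor(0.75 * nrow(X)))

# Accuracy with the correct labels
observed <- centroid_accuracy(X, y, train_idx)

# Null distribution: the same procedure with labels randomly permuted
null_acc <- replicate(100, centroid_accuracy(X, sample(y), train_idx))

# Proportion of null accuracies at least as high as the observed accuracy
p_value <- mean(null_acc >= observed)
```

If the features carry real class information, the observed accuracy should sit well above the bulk of the null accuracies.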

seed

integer to fix R's random number generator to ensure reproducibility. Defaults to 123

Value

list containing a named vector of train-test set sizes, and a data.frame of classification performance results

Author(s)

Trent Henderson

Examples


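library(theft)
library(theftdlc)

# Calculate catch22 features for the simulated data set shipped with theft
features <- theft::calculate_features(theft::simData,
  group_var = "process",
  feature_set = "catch22")

# Fit classifiers for individual features, using only 3 resamples for speed
classifiers <- classify(features,
  by_set = FALSE,
  n_resamples = 3)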


[Package theftdlc version 0.1.0 Index]