R: Calculating the 0-1 loss incurred in prediction

zero_one_loss {BCT}

R Documentation

Calculating the 0-1 loss incurred in prediction

Description

Compute the 0-1 loss, i.e., the proportion of incorrectly predicted values, incurred in BCT prediction with memory length D. Given an initial context (x_{-D+1}, ..., x_0) and training data (x_1, ..., x_n), the 0-1 loss is computed by sequentially predicting the test data (x_{n+1}, ..., x_{n+T}). The function outputs the cumulative, normalized (per-sample) 0-1 loss at each prediction step; for more information see Kontoyiannis et al. (2020).

Usage

zero_one_loss(input_data, depth, train_size, beta = NULL)

Arguments

`input_data`	the sequence to be analysed. The sequence needs to be a "character" object. See the examples section of kBCT/BCT functions on how to transform any dataset to a "character" object.
`depth`	maximum memory length.
`train_size`	number of samples used in the training set. The training set size should be at least equal to the depth.
`beta`	hyper-parameter of the model prior. Takes values between 0 and 1. If not specified in the function call, the default value is 1 - 2^(-m+1), where m is the size of the alphabet; for more information see Kontoyiannis et al. (2020).
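
As an illustration of the default for `beta`, the sketch below evaluates 1 - 2^(-m+1) for a given sequence. It assumes that the alphabet size m is the number of distinct characters in the input string; `default_beta` is a hypothetical helper written for this example and is not a function exported by BCT.

```r
# Hypothetical helper (not part of BCT): default beta for a character sequence,
# assuming m is the number of distinct symbols that appear in the sequence
default_beta <- function(input_data) {
  symbols <- unique(strsplit(input_data, "")[[1]])
  m <- length(symbols)
  1 - 2^(-m + 1)
}

default_beta("10001001")   # binary alphabet, m = 2, gives 0.5
default_beta("ACGTTGCA")   # four symbols, m = 4, gives 0.875
```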

Value

Returns a numeric vector containing, for each prediction step, the cumulative average number of errors (errors made so far divided by the number of predictions made so far).

Examples

# Use the pewee dataset and look at the last 8 elements:
substring(pewee, nchar(pewee)-7, nchar(pewee))
# [1] "10001001"

# Predict the last 8 elements using the prediction function,
# taking only the "Prediction" vector:
pred <- prediction(pewee, 10, nchar(pewee)-8)[["Prediction"]]

pred
# [1] "1" "0" "0" "1" "1" "0" "0" "1"

# To transform the result of the prediction function into a "character" object:
paste(pred, collapse = "")
# [1] "10011001"

# As observed, there is only 1 error (the fourth predicted element is 1 instead of 0).
# Thus, up to the 3rd place, the averaged error is 0,
# and the fourth averaged error is expected to be 1/4.
# Indeed, the zero_one_loss function yields the expected answer:

zero_one_loss(pewee, 10, nchar(pewee)-8) 
# [1] 0.0000000 0.0000000 0.0000000 0.2500000 0.2000000 0.1666667 0.1428571 0.1250000
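
The values above can be checked by hand. The following is a minimal sketch using base R only: it hardcodes the two 8-character strings shown earlier instead of calling the package, and the variable names (`truth`, `predicted`, `errors`, `manual_loss`) are illustrative and not part of BCT.

```r
# True and predicted last 8 symbols, copied from the output shown earlier
truth     <- strsplit("10001001", "")[[1]]
predicted <- strsplit("10011001", "")[[1]]

# 1 where the prediction is wrong, 0 where it is right
errors <- as.integer(truth != predicted)

# Cumulative number of errors divided by the number of predictions so far
manual_loss <- cumsum(errors) / seq_along(errors)
manual_loss
```

The resulting vector agrees numerically with the zero_one_loss output above: it is 0 until the single error occurs and then decays as 1/4, 1/5, ..., 1/8.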

[Package BCT version 1.2 Index]