log_loss {BCT} | R Documentation |
Calculating the log-loss incurred in prediction
Description
Compute the log-loss incurred in BCT prediction with memory length D. Given an initial context (x_{-D+1}, ..., x_0) and training data (x_1, ..., x_n), the log-loss is computed in sequentially predicting the test data (x_{n+1}, ..., x_{n+T}). The function outputs the cumulative, normalized (per-sample) log-loss at each prediction step; for more information see Kontoyiannis et al. (2020).
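To make the reported quantity concrete, the following is a minimal sketch (not the package's internal computation), assuming base-2 logarithms and a hypothetical vector probs of predictive probabilities assigned by the model to the actual test symbols:

# Assumed predictive probabilities for 4 test symbols (hypothetical values).
probs <- c(0.8, 0.6, 0.9, 0.7)
# Cumulative, per-sample log-loss after each prediction step.
cum_log_loss <- cumsum(-log2(probs)) / seq_along(probs)
cum_log_loss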
Usage
log_loss(input_data, depth, train_size, beta = NULL)
Arguments
input_data: the sequence to be analysed. The sequence needs to be a "character" object. See the examples section of the BCT/kBCT functions on how to transform any dataset into a "character" object; a minimal sketch is also given after this list.
depth: maximum memory length.
train_size: number of samples used in the training set. The training set size should be at least equal to the depth.
beta: hyper-parameter of the model prior. Takes values between 0 and 1. If not initialised in the function call, the default value is 1 - 2^(-m+1), where m is the size of the alphabet; for more information see Kontoyiannis et al. (2020).
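The following is a minimal, hypothetical sketch (an illustration assumed here, not taken from the package's own examples) of how a discrete dataset can be collapsed into a single "character" object, and of how the default value 1 - 2^(-m+1) of beta is obtained for an alphabet of size m:

# Hypothetical discrete data; any symbols work as long as each
# observation maps to a single character.
x <- c(0, 1, 2, 1, 1, 0, 2, 2)

# Collapse the vector into one "character" object, as required by input_data.
input_data <- paste(x, collapse = "")
input_data                      # "01211022"

# Default value of the beta hyper-parameter for an alphabet of size m.
m <- length(unique(x))
beta_default <- 1 - 2^(-m + 1)
beta_default                    # 0.75 for a ternary alphabet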
Value
returns a vector containing the cumulative, per-sample (averaged) log-loss incurred in the sequential prediction, with one entry per prediction time-step.
Examples
# Compute the log-loss in the prediction of the last 10 elements
# of a dataset.
log_loss(pewee, 5, nchar(pewee) - 10)
# For custom beta (e.g. 0.7):
log_loss(pewee, 5, nchar(pewee) - 10, 0.7)
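# Since the function returns one value per prediction step, the learning
# curve can be inspected directly (a usage sketch, not from the package's
# own examples):
ll <- log_loss(pewee, 5, nchar(pewee) - 10)
plot(ll, type = "b", xlab = "prediction step", ylab = "log-loss per sample")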