Computes the log-loss incurred in BCT prediction with memory length D. Given an initial context (x_{-D+1}, ..., x_{0}) and training data (x_{1}, ..., x_{n}), the log-loss is computed by sequentially predicting the test data (x_{n+1}, ..., x_{n+T}). The function outputs the cumulative, normalized (per-sample) log-loss at each prediction step; for more information see Kontoyiannis et al. (2020).
log_loss(input_data, depth, train_size, beta = NULL)
input_data
the sequence to be analysed. The sequence must be a "character" object. See the examples section of the BCT/kBCT functions for how to transform any dataset into a "character" object.
depth
maximum memory length.
train_size
number of samples used in the training set. The training set size should be at least equal to the depth.
beta
hyper-parameter of the model prior. Takes values between 0 and 1. If not specified in the function call, the default value is 1-2^{-m+1}, where m is the size of the alphabet; for more information see Kontoyiannis et al. (2020).
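As a minimal sketch of how the arguments above fit together, the following base R snippet builds a "character" input from a numeric vector and evaluates the documented default for beta. The toy vector `x` and the variable names are hypothetical, and the snippet assumes the alphabet size m is the number of distinct symbols in the sequence; it does not call the package itself.

```r
# Hypothetical toy data: a binary sequence stored as a numeric vector.
x <- c(0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

# Collapse the vector into a single string, one symbol per character.
input_data <- paste(x, collapse = "")

# Assumption: the alphabet size m is the number of distinct symbols observed.
m <- length(unique(strsplit(input_data, "")[[1]]))

# Documented default for beta when it is not supplied: 1 - 2^(-m + 1).
default_beta <- 1 - 2^(-m + 1)

# train_size should be at least equal to depth (values here are illustrative).
depth <- 2
train_size <- nchar(input_data) - 3
stopifnot(train_size >= depth)
```

For a binary alphabet (m = 2) the default evaluates to 0.5.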
Returns a vector containing the averaged log-loss incurred in the sequential prediction at each time-step.
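To illustrate what a cumulative, per-sample log-loss vector looks like, here is a small base R sketch. The predictive probabilities `p` are made-up values standing in for the probabilities a model would assign to each observed test symbol, and the base-2 logarithm is an assumption chosen for readable numbers; the base used by the package is not stated here.

```r
# Hypothetical predictive probabilities assigned to the T = 3 observed
# test symbols (not produced by the BCT model).
p <- c(0.5, 0.25, 0.5)

# Per-step loss; log base 2 is an assumption made for this illustration.
step_loss <- -log2(p)

# Cumulative loss divided by the number of predictions made so far,
# giving one averaged value per prediction step.
avg_loss <- cumsum(step_loss) / seq_along(step_loss)
```

The result has one entry per test sample, and its last entry is the average loss over the whole test set.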
# Compute the log-loss in the prediction of the last 10 elements
# of a dataset.
log_loss(pewee, 5, nchar(pewee) - 10)
# For custom beta (e.g. 0.7):
log_loss(pewee, 5, nchar(pewee) - 10, 0.7)