mean_squared_error {ocf} | R Documentation |
Accuracy Measures for Ordered Probability Predictions
Description
Accuracy measures for evaluating ordered probability predictions.
Usage
mean_squared_error(y, predictions, use.true = FALSE)
mean_absolute_error(y, predictions, use.true = FALSE)
mean_ranked_score(y, predictions, use.true = FALSE)
classification_error(y, predictions)
Arguments
y |
Either the observed outcome vector or a matrix of true probabilities. |
predictions |
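Predictions: a matrix of predicted class probabilities, or a vector of predicted class labels for classification_error (see Details). |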
use.true |
If |
Details
MSE, MAE, and RPS
When calling one of mean_squared_error, mean_absolute_error, or mean_ranked_score, predictions must be a matrix of predicted class probabilities, with as many rows as observations in y and as many columns as classes of y.
If use.true == FALSE, the mean squared error (MSE), the mean absolute error (MAE), and the mean ranked probability score (RPS) are computed as follows:
MSE = \frac{1}{n} \sum_{i = 1}^n \sum_{m = 1}^M (1 (Y_i = m) - \hat{p}_m (x))^2
MAE = \frac{1}{n} \sum_{i = 1}^n \sum_{m = 1}^M |1 (Y_i = m) - \hat{p}_m (x)|
RPS = \frac{1}{n} \sum_{i = 1}^n \frac{1}{M - 1} \sum_{m = 1}^M (1 (Y_i \leq m) - \hat{p}_m^* (x))^2
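As an illustration, the three formulas above can be evaluated by hand in base R. This is a minimal sketch on made-up toy data, not the package's internal implementation; the names y, p_hat, ind, ind_cum, and p_hat_cum are hypothetical, and classes are assumed to be coded as 1, ..., M.

```r
# Toy data (assumed for illustration): 4 observations, M = 3 classes.
y <- c(1, 2, 3, 2)
p_hat <- matrix(c(0.6, 0.3, 0.1,
                  0.2, 0.5, 0.3,
                  0.1, 0.2, 0.7,
                  0.3, 0.4, 0.3),
                ncol = 3, byrow = TRUE)  # each row sums to 1
M <- ncol(p_hat)

# Indicator matrix: entry (i, m) equals 1 when Y_i = m, 0 otherwise.
ind <- outer(y, seq_len(M), "==") * 1

# Cumulative counterparts: 1(Y_i <= m) and the predicted P(Y_i <= m).
ind_cum <- outer(y, seq_len(M), "<=") * 1
p_hat_cum <- t(apply(p_hat, 1, cumsum))

# Sum over classes within each observation, then average over observations.
mse <- mean(rowSums((ind - p_hat)^2))
mae <- mean(rowSums(abs(ind - p_hat)))
rps <- mean(rowSums((ind_cum - p_hat_cum)^2) / (M - 1))
```

The RPS compares cumulative distributions, so it penalizes probability mass placed far from the observed class more heavily than mass placed on an adjacent class.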
If use.true == TRUE, the MSE, the MAE, and the RPS are computed as follows (useful for simulation studies):
MSE = \frac{1}{n} \sum_{i = 1}^n \sum_{m = 1}^M (p_m (x) - \hat{p}_m (x))^2
MAE = \frac{1}{n} \sum_{i = 1}^n \sum_{m = 1}^M |p_m (x) - \hat{p}_m (x)|
RPS = \frac{1}{n} \sum_{i = 1}^n \frac{1}{M - 1} \sum_{m = 1}^M (p_m^* (x) - \hat{p}_m^* (x))^2
where:
p_m (x) = P(Y_i = m | X_i = x)
p_m^* (x) = P(Y_i \leq m | X_i = x)
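The use.true == TRUE formulas can be sketched the same way, as one might in a simulation where the true probabilities are known. The matrices p_true and p_hat and the helper cum_probs below are hypothetical and the numbers are invented for illustration; this is not the package's own code.

```r
# Toy simulation setting (assumed): 2 observations, M = 3 classes.
p_true <- matrix(c(0.5, 0.3, 0.2,
                   0.2, 0.6, 0.2),
                 ncol = 3, byrow = TRUE)  # true P(Y_i = m | X_i = x)
p_hat <- matrix(c(0.4, 0.4, 0.2,
                  0.1, 0.6, 0.3),
                ncol = 3, byrow = TRUE)   # predicted probabilities
M <- ncol(p_true)

# Hypothetical helper: row-wise cumulative sums give P(Y_i <= m | X_i = x).
cum_probs <- function(p) t(apply(p, 1, cumsum))

# Same structure as before, with true probabilities replacing indicators.
mse <- mean(rowSums((p_true - p_hat)^2))
mae <- mean(rowSums(abs(p_true - p_hat)))
rps <- mean(rowSums((cum_probs(p_true) - cum_probs(p_hat))^2) / (M - 1))
```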
Classification error
When calling classification_error, predictions must be a vector of predicted class labels.
Classification error (CE) is computed as follows:
CE = \frac{1}{n} \sum_{i = 1}^n 1 (Y_i \neq \hat{Y}_i)
where Y_i are the observed class labels.
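For intuition, the CE formula reduces to the share of misclassified observations. A minimal base R sketch with invented labels (y and y_hat are hypothetical names) might look like this:

```r
# Toy labels (assumed for illustration): 5 observations.
y <- c(1, 2, 3, 2, 1)      # observed class labels
y_hat <- c(1, 3, 3, 2, 2)  # predicted class labels

# y != y_hat is TRUE for each misclassified observation;
# the mean of a logical vector is the proportion of TRUE values.
ce <- mean(y != y_hat)
```

Unlike the RPS, this measure ignores the ordering of the classes: predicting class 3 when the truth is class 1 counts the same as predicting class 2.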
Value
The MSE, the MAE, the RPS, or the CE of the method.
Author(s)
Riccardo Di Francesco
See Also
Examples
## Load data from orf package.
set.seed(1986)
library(orf)
data(odata)
odata <- odata[1:100, ] # Subset to reduce elapsed time.
y <- as.numeric(odata[, 1])
X <- as.matrix(odata[, -1])
## Training-test split.
train_idx <- sample(seq_len(length(y)), floor(length(y) * 0.5))
y_tr <- y[train_idx]
X_tr <- X[train_idx, ]
y_test <- y[-train_idx]
X_test <- X[-train_idx, ]
## Fit ocf on training sample.
forests <- ocf(y_tr, X_tr)
## Accuracy measures on test sample.
predictions <- predict(forests, X_test)
mean_squared_error(y_test, predictions$probabilities)
mean_ranked_score(y_test, predictions$probabilities)
classification_error(y_test, predictions$classification)