isodistrreg-package {isodistrreg} | R Documentation |
Isotonic distributional regression (IDR)
Description
Isotonic distributional Regression (IDR) is a nonparametric method to estimate conditional distributions under monotonicity constraints.
How does it work?
Read the arXiv preprint ‘Isotonic Distributional Regression’ on
https://arxiv.org/abs/1909.03725 or by calling
browseVignettes(package = "isodistrreg")
.
The isodistrreg package
To make probabilistic forecasts with IDR,
call
idr(y = y, X = X, ...)
, wherey
is the response variable (e.g. weather variable observations) andX
is adata.frame
of covariates (e.g. ensemble forecasts).use
predict(fit, data)
, wherefit
is the model fit computed withidr
anddata
is the data based on which you want to make predictions.Try
idrbag
for IDR with (su)bagging.
The following pre-defined functions are available to evaluate IDR predictions:
-
cdf
andqpred
to compute the cumulative distribution function (CDF) and quantile function of IDR predictions. -
bscore
andqscore
to calculate Brier scores for probability forecasts for threshold exceedance (e.g. probability of precipitation) and quantile scores (e.g. mean absolute error of median forecast.) -
crps
to compute the continuous ranked probability score (CRPS). -
pit
to compute the probability integral transform (PIT). -
plot
to plot IDR predictive CDFs.
Use the dataset rain
to test IDR.
References
Alexander Henzi, Johanna F. Ziegel, and Tilmann Gneiting. Isotonic Distributional Regression. arXiv e-prints, art. arXiv:1909.03725, Sep 2019. URL https://arxiv.org/abs/1909.03725.
Examples
## A usage example:
# Prepare dataset: Half of the data as training dataset, other half for validation.
# Consult the R documentation (?rain) for details about the dataset.
data(rain)
trainingData <- subset(rain, date <= "2012-01-09")
validationData <- subset(rain, date > "2012-01-09")
# Variable selection: use HRES and the perturbed forecasts P1, ..., P50
varNames <- c("HRES", paste0("P", 1:50))
# Partial orders on variable groups: Usual order of numbers on HRES (group '1') and
# increasing convex order on the remaining variables (group '2').
groups <- setNames(c(1, rep(2, 50)), varNames)
orders <- c("comp" = 1, "icx" = 2)
# Fit IDR to training dataset.
fit <- idr(
y = trainingData[["obs"]],
X = trainingData[, varNames],
groups = groups,
orders = orders
)
# Make prediction for the first day in the validation data:
firstPrediction <- predict(fit, data = validationData[1, varNames])
plot(firstPrediction)
# Use cdf() and qpred() to make probability and quantile forecasts:
## What is the probability of precipitation?
1 - cdf(firstPrediction, thresholds = 0)
## What are the predicted 10%, 50% and 90% quantiles for precipitation?
qpred(firstPrediction, quantiles = c(0.1, 0.5, 0.9))
# Make predictions for the complete verification dataset and compare IDR calibrated
# forecasts to the raw ensemble (ENS):
predictions <- predict(fit, data = validationData[, varNames])
y <- validationData[["obs"]]
## Continuous ranked probability score (CRPS):
CRPS <- cbind(
"ens" = crps(validationData[, varNames], y),
"IDR" = crps(predictions, y)
)
apply(CRPS, 2, mean)
## Brier score for probability of precipitation:
BS <- cbind(
"ens" = bscore(validationData[, varNames], thresholds = 0, y),
"IDR" = bscore(predictions, thresholds = 0, y)
)
apply(BS, 2, mean)
## Quantile score of forecast for 90% quantile:
QS90 <- cbind(
"ens" = qscore(validationData[, varNames], quantiles = 0.9, y),
"IDR" = qscore(predictions, quantiles = 0.9, y)
)
apply(QS90, 2, mean)
## Check calibration using (randomized) PIT histograms:
pitEns <- pit(validationData[, varNames], y)
pitIdr <- pit(predictions, y)
hist(pitEns, main = "PIT of raw ensemble forecasts", freq = FALSE)
hist(pitIdr, main = "PIT of IDR calibrated forecasts", freq = FALSE)