predict.idrfit {isodistrreg}    R Documentation

Predict method for IDR fits

Description

Prediction based on IDR model fit.

Usage

## S3 method for class 'idrfit'
predict(object, data = NULL, digits = 3, interpolation = "linear", ...)

Arguments

object

IDR fit (object of class "idrfit").

data

optional data.frame containing the variables with which to predict. If omitted, in-sample predictions are returned. Ordered factor variables are converted to numeric for computation, so ensure that their factor levels are identical in data and in the training data used for the fit.

digits

number of decimal places for the predictive CDF.

interpolation

interpolation method for univariate data. Default is "linear". Any other value selects midpoint interpolation (see 'Details'). Has no effect for multivariate IDR.

...

included for generic function consistency.
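
The note on ordered factors in the data argument can be illustrated with base R alone. The sketch below assumes the conversion uses the integer level codes, as as.numeric() does on a factor; the variable names are made up for illustration and are not part of the package.

```r
# Training data: an ordered factor with three levels.
train <- factor(c("low", "high", "medium"),
                levels = c("low", "medium", "high"), ordered = TRUE)

# New data built without the "medium" level: "high" now gets code 2, not 3.
new_bad <- factor(c("low", "high"),
                  levels = c("low", "high"), ordered = TRUE)

# New data built with the training levels: codes agree with the training data.
new_ok <- factor(c("low", "high"),
                 levels = levels(train), ordered = TRUE)

as.numeric(train)    # 1 3 2
as.numeric(new_bad)  # 1 2
as.numeric(new_ok)   # 1 3
```

Passing levels = levels(train) when building the new factor keeps the numeric codes aligned with those used for the fit.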

Details

If the variables x = data[j,] for which predictions are desired are already contained in the training dataset X for the fit, predict.idrfit returns the corresponding in-sample prediction. Otherwise monotonicity is used to derive upper and lower bounds for the predictive CDF, and the predictive CDF is a pointwise average of these bounds. For univariate IDR with a numeric covariate, the predictive CDF is computed by linear interpolation. Otherwise, or if interpolation != "linear", midpoint interpolation is used, i.e. default weights of 0.5 for both the lower and the upper bound.
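
The two interpolation schemes can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: the CDF values, the covariate values x_lo and x_hi, and the helper combine_bounds are hypothetical. Under the monotonicity assumption, the CDF at the smaller neighbouring covariate serves as the upper bound and the CDF at the larger one as the lower bound.

```r
# Hypothetical in-sample CDFs at two neighbouring training covariates
# x_lo < x_hi, evaluated on a common grid of points.
pts         <- c(0, 1, 2, 5)
cdf_at_x_lo <- c(0.50, 0.80, 0.95, 1.00)  # smaller covariate: upper bound
cdf_at_x_hi <- c(0.20, 0.50, 0.80, 1.00)  # larger covariate: lower bound

# Pointwise weighted average of the lower and the upper bound.
combine_bounds <- function(lower, upper, weight_lower = 0.5) {
  weight_lower * lower + (1 - weight_lower) * upper
}

# Midpoint interpolation: weights of 0.5 for both bounds.
midpoint <- combine_bounds(cdf_at_x_hi, cdf_at_x_lo)

# Linear interpolation for a new covariate x between x_lo and x_hi:
# the closer x is to x_hi, the more weight the lower bound receives.
x_lo <- 1; x_hi <- 3; x <- 1.5
w <- (x - x_lo) / (x_hi - x_lo)
linear <- combine_bounds(cdf_at_x_hi, cdf_at_x_lo, weight_lower = w)

data.frame(points = pts, lower = cdf_at_x_hi, midpoint = midpoint,
           linear = linear, upper = cdf_at_x_lo)
```

Both interpolated CDFs lie between the bounds at every point; they differ only in how the weight is split.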

If the lower and the upper bound on the predictive CDF are far apart (or trivial, i.e. constant 0 or constant 1), this indicates that the prediction based on x is uncertain, because either the training dataset is too small or only few variable combinations similar to x have been observed in the training data. However, the bounds on the predictive CDF are not prediction intervals and should not be interpreted as such. They only indicate the uncertainty of out-of-sample predictions for which the variables are not contained in the training data.

If the new variables x are greater than all X[i, ] in the selected order(s), the lower bound on the CDF is trivial (constant 0) and the upper bound is taken as the predictive CDF. The upper bound on the CDF is trivial (constant 1) if x is smaller than all X[i, ]. If x is not comparable to any row of X in the given order, a prediction based on the training data is not possible. In that case, the default forecast is the empirical distribution of y in the training data.
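
What "comparable" means can be made concrete for the componentwise order. The following base R sketch is only an illustration: the matrix X and the helper is_comparable are hypothetical, and it covers the componentwise order only, not the other orders the package supports.

```r
# Hypothetical training covariates, one observation per row.
X <- rbind(c(1, 4), c(2, 2), c(3, 5))

# x is comparable to a row if it is componentwise <= or >= that row.
is_comparable <- function(x, X) {
  below <- apply(X, 1, function(row) all(x <= row))
  above <- apply(X, 1, function(row) all(x >= row))
  any(below | above)
}

is_comparable(c(2, 3), X)  # TRUE: c(2, 3) >= c(2, 2)
is_comparable(c(4, 6), X)  # TRUE: greater than every row of X
is_comparable(c(0, 6), X)  # FALSE: neither <= nor >= any row
```

In this sketch, c(4, 6) corresponds to the case with a trivial lower bound, and c(0, 6) to the case in which the default forecast would be returned.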

Value

A list of predictions. Each prediction is a data.frame containing the following variables:

points

the points where the predictive CDF has jumps.

cdf

the estimated CDF evaluated at the points.

lower, upper

(only for out-of-sample predictions) bounds for the estimated CDF, see 'Details' above.

The output has the attribute incomparables, which gives the indices of all predictions for which the climatological forecast (the empirical distribution of y in the training data) is returned because the forecast variables are not comparable to the training data.
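
As a rough illustration of how the returned list might be inspected, the sketch below fits a univariate IDR to made-up toy data and looks at one prediction. The data, the column name x, and the object names are assumptions for illustration; the code only runs if isodistrreg is installed.

```r
if (requireNamespace("isodistrreg", quietly = TRUE)) {
  library(isodistrreg)

  # Toy univariate training data (made up for illustration).
  set.seed(1)
  train <- data.frame(x = 1:20)
  y <- rnorm(20, mean = train$x)
  fit <- idr(y = y, X = train)

  # One covariate value from the training data, one in between two
  # training values, and one above all training values.
  preds <- predict(fit, data = data.frame(x = c(5, 5.5, 25)))

  length(preds)                 # one prediction per row of 'data'
  names(preds[[2]])             # columns of the second prediction
  preds[[2]]$points             # points where the predictive CDF jumps
  preds[[2]]$cdf                # CDF values at those points
  attr(preds, "incomparables")  # indices with the default forecast
}
```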

See Also

idr to fit IDR to training data.

cdf, qpred to evaluate the CDF or quantile function of IDR predictions.

bscore, qscore, crps, pit to compute Brier scores, quantile scores, the CRPS and the PIT of IDR predictions.

plot to plot IDR predictive CDFs.

Examples

data("rain")

## Fit IDR to data of 185 days using componentwise order on HRES and CTR and
## increasing convex order on perturbed ensemble forecasts (P1, P2, ..., P50)

varNames <- c("HRES", "CTR", paste0("P", 1:50))
X <- rain[1:185, varNames]
y <- rain[1:185, "obs"]

## HRES and CTR are group '1', with componentwise order "comp", perturbed
## forecasts P1, ..., P50 are group '2', with "icx" order

groups <- setNames(c(1, 1, rep(2, 50)), varNames)
orders <- c("comp" = 1, "icx" = 2)

fit <- idr(y = y, X = X, orders = orders, groups = groups)

## Predict for day 186
predict(fit, data = rain[186, varNames])

[Package isodistrreg version 0.1.0]