| validate.predictions {georob} | R Documentation |
Summary Statistics of (Cross-)Validation Prediction Errors
Description
Functions to compute and plot summary statistics of prediction errors to (cross-)validate fitted spatial linear models by the criteria proposed by Gneiting et al. (2007) for assessing probabilistic forecasts.
Usage
validate.predictions(data, pred, se.pred,
statistic = c("crps", "pit", "mc", "bs", "st"), ncutoff = NULL)
## S3 method for class 'cv.georob'
plot(x,
type = c("sc", "lgn.sc", "ta", "qq", "hist.pit", "ecdf.pit", "mc", "bs"),
smooth = TRUE, span = 2/3, ncutoff = NULL, add = FALSE,
col, pch, lty, main, xlab, ylab, ...)
## S3 method for class 'cv.georob'
print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'cv.georob'
summary(object, se = FALSE, ...)
Arguments
data |
a numeric vector with observations about a response (mandatory argument). |
pred |
a numeric vector with predictions for the response (mandatory argument). |
se.pred |
a numeric vector with prediction standard errors (mandatory argument). |
statistic |
a character keyword defining what statistic of the prediction errors should be computed. Possible values are (see Details):
|
ncutoff |
a positive integer ( |
x, object |
objects of class |
digits |
a positive integer indicating the number of decimal digits to print. |
type |
a character keyword defining what type of plot is created by the
|
smooth |
a logical scalar controlling whether scatter plots of data
vs. predictions should be smoothed by
|
span |
a numeric with the smoothness parameter for loess (see
|
add |
a logical scalar controlling whether the current high-level plot is
added to an existing graphics without cleaning the frame before (default:
|
main, xlab, ylab |
title and axes labels of plot. |
col, pch, lty |
color, symbol and line type. |
se |
a logical scalar controlling if the standard errors of the
averaged continuous ranked probability score and of the mean and
dispersion statistics of the prediction errors (see Details) are
computed from the respective values of the |
... |
additional arguments passed to the methods. |
Details
validate.predictions computes the items required to evaluate (and
plot) the diagnostic criteria proposed by Gneiting et al. (2007) for
assessing the calibration and the sharpness of
probabilistic predictions of (cross-)validation data. To this aim,
validate.predictions uses the assumption that the prediction
errors
Y(\boldsymbol{s})-\widehat{Y}(\boldsymbol{s})
follow normal distributions with zero mean and standard deviations equal
to the Kriging standard errors. This assumption is an approximation if
the errors \varepsilon come from a long-tailed
distribution. Furthermore, for the time being, the Kriging variance of
the response Y is approximated by adding the estimated
nugget \widehat{\tau}^2 to the Kriging variance of the
signal Z. This approximation likely underestimates the mean
squared prediction error of the response if the errors come from a
long-tailed distribution. Hence, for robust Kriging, the standard errors of
the (cross-)validation errors are likely too small.
Notwithstanding these difficulties and imperfections, validate.predictions computes
the probability integral transform (PIT),
\mathrm{PIT}_i = F_i(y_i),where
F_i(y_i)denotes the (plug-in) predictive CDF evaluated aty_i, the value of theith (cross-)validation datum,the average predictive CDF (plug-in)
\bar{F}_n(y)=1/n \sum_{i=1}^n F_i(y),where
nis the number of (cross-)validation observations and theF_iare evaluated atNquantiles equal to the set of distincty_i(or a subset of sizeNof them),the Brier Score (plug-in)
\mathrm{BS}(y) = 1/n \sum_{i=1}^n \left(F_i(y) - I(y_i \leq y) \right)^2,where
I(x)is the indicator function for the eventx, and the Brier score is again evaluated at the unique values of the (cross-)validation observations (or a subset of sizeNof them),the averaged continuous ranked probability score, CRPS, a strictly proper scoring criterion to rank predictions, which is related to the Brier score by
\mathrm{CRPS} = \int_{-\infty}^\infty \mathrm{BS}(y) \,dy.
Gneiting et al. (2007) proposed the following plots to validate probabilistic predictions:
A histogram (or a plot of the empirical CDF) of the PIT values. For ideal predictions, with observed coverages of prediction intervals matching nominal coverages, the PIT values have an uniform distribution.
Plots of
\bar{F}_n(y)and of the empirical CDF of the data, say\widehat{G}_n(y), and of their difference,\bar{F}_n(y)-\widehat{G}_n(y)vsy. The forecasts are said to be marginally calibrated if\bar{F}_n(y)and\widehat{G}_n(y)match.A plot of
\mathrm{BS}(y)vs.y. Probabilistic predictions are said to be sharp if the area under this curve, which equals CRPS, is minimized.
The plot method for class cv.georob allows to create
these plots, along with scatter-plots of observations and predictions,
Tukey-Anscombe and normal QQ plots of the standardized prediction
errors.
summary.cv.georob computes the mean and dispersion statistics
of the (standardized) prediction errors (by a call to
validate.prediction with argument statistic = "st", see
Value) and the averaged continuous ranked probability score
(crps). If present in the cv.georob object, the error
statistics are also computed for the errors of the unbiasedly
back-transformed predictions of a log-transformed response. If se
is TRUE then these statistics are evaluated separately for the
K cross-validation subsets and the standard errors of the means of
these statistics are returned as well.
The print method for class cv.georob returns the mean
and dispersion statistics of the (standardized) prediction errors.
Value
Depending on the argument statistic, the function
validate.predictions returns
a numeric vector of PIT values if
statisticis equal to"pit",a named numeric vector with summary statistics of the (standardized) prediction errors if
statisticis equal to"st". The following statistics are computed:memean prediction error medemedian prediction error rmseroot mean squared prediction error mademedian absolute prediction error qneQn dispersion measure of prediction errors (see Qn)mssemean squared standardized prediction error medssemedian squared standardized prediction error a data frame if
statisticis equal to"mc"or"bs"with the components (see Details):zthe sorted unique (cross-)validation observations (or a subset of size ncutoffof them)ghatthe empirical CDF of the (cross-)validation observations \widehat{G}_n(y)fbarthe average predictive distribution \bar{F}_n(y)bsthe Brier score \mathrm{BS}(y)
The method print.cv.georob invisibly returns the object unchanged.
The method summary.cv.georob returns an object of class
summary.cv.georob which is a list with 3 components:
-
sta numeric vector with summary statistics of the (standardized) prediction errors of the possibly log-transformed response, see output of functionvalidate.predictionsfor argumentstatistic = "st"above. -
crpsthe value of the continuous ranked probability score. -
st.lgna numeric vector with summary statistics of the (standardized) prediction errors of the back-transformed response if argumentlgn = TRUEandNULLotherwise.
There is a print method for objects of class summary.cv.georob
which invisibly returns the object unchanged.
The method plot.georob is called for its side effects and
invisibly returns NULL.
Author(s)
Andreas Papritz papritz@retired.ethz.ch.
References
Gneiting, T., Balabdaoui, F. and Raftery, A. E.(2007) Probabilistic
forecasts, calibration and sharpness. Journal of the Royal
Statistical Society Series B 69, 243–268,
doi:10.1111/j.1467-9868.2007.00587.x.
See Also
georob for (robust) fitting of spatial linear models;
cv.georob for assessing the goodness of a fit by georob.
Examples
## define number of cores for parallel computations
if(interactive()) ncpu <- 10L else ncpu <- 1L
data(meuse)
r.logzn <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y,
variogram.model = "RMexp",
param = c(variance = 0.15, nugget = 0.05, scale = 200),
tuning.psi = 1000)
if(interactive()){
## example is run only in interactive session because cpu times exceeds 5 s
r.logzn.cv.1 <- cv(r.logzn, seed = 1, lgn = TRUE, ncores = 1, verbose = 1)
r.logzn.cv.2 <- cv(r.logzn, formula = .~. + ffreq, seed = 1, lgn = TRUE,
ncores = ncpu)
summary(r.logzn.cv.1, se = TRUE)
summary(r.logzn.cv.2, se = TRUE)
op <- par(mfrow = c(2,2))
plot(r.logzn.cv.1, type = "lgn.sc")
plot(r.logzn.cv.2, type = "lgn.sc", add = TRUE, col = "red")
abline(0, 1, lty= "dotted")
plot(r.logzn.cv.1, type = "ta")
plot(r.logzn.cv.2, type = "ta", add = TRUE, col = "red")
abline(h=0, lty= "dotted")
plot(r.logzn.cv.2, type = "mc", col = "red")
plot(r.logzn.cv.1, type = "bs")
plot(r.logzn.cv.2, type = "bs", add = TRUE, col = "red")
legend("topright", lty = 1, col = c("black", "red"), bty = "n",
legend = c("log(Zn) ~ sqrt(dist)", "log(Zn) ~ sqrt(dist) + ffreq"))
par(op)
}