veriApply {easyVerification} | R Documentation |
Apply Verification Metrics to Large Datasets
Description
This wrapper applies verification metrics to arrays of forecast ensembles and verifying observations. Various array-based data formats are supported. Additionally, continuous forecasts (and observations) are transformed to category forecasts using user-defined absolute thresholds or percentiles of the long-term climatology (see details).
Usage
veriApply(
verifun,
fcst,
obs,
fcst.ref = NULL,
tdim = length(dim(fcst)) - 1,
ensdim = length(dim(fcst)),
prob = NULL,
threshold = NULL,
strategy = "none",
na.rm = FALSE,
fracmin = 0.8,
nmin = NULL,
parallel = FALSE,
maxncpus = 16,
ncpus = NULL,
...
)
Arguments
verifun |
Name of function to compute verification metric (score, skill score) |
fcst |
array of forecast values (at least 2-dimensional) |
obs |
array or vector of verifying observations |
fcst.ref |
array of forecast values for the reference forecast (skill scores only) |
tdim |
index of dimension with the different forecasts |
ensdim |
index of dimension with the different ensemble members |
prob |
probability threshold for category forecasts (see below) |
threshold |
absolute threshold for category forecasts (see below) |
strategy |
type of out-of-sample reference forecasts or namelist with
arguments as in |
na.rm |
logical, should incomplete forecasts be used? |
fracmin |
fraction of forecasts that are not-missing for forecast to
be evaluated. Used to determine |
nmin |
number of forecasts that are not-missing for forecast to
be evaluated. If both |
parallel |
logical, should parallel execution of verification be used (see below)? |
maxncpus |
upper bound for self-selected number of CPUs |
ncpus |
number of CPUs used in parallel computation, self-selected
number of CPUs is used when |
... |
additional arguments passed to |
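As a rough illustration of the tdim and ensdim defaults shown under Usage, the following base R sketch evaluates the default expressions on a hypothetical forecast array. The array dimensions and values are made up for illustration and are not taken from the package.

```r
# Hypothetical forecast array: 10 x 5 locations, 20 forecasts, 51 members
fcst <- array(rnorm(10 * 5 * 20 * 51), dim = c(10, 5, 20, 51))

# Default expressions from the usage section
tdim <- length(dim(fcst)) - 1  # second-to-last dimension: the different forecasts
ensdim <- length(dim(fcst))    # last dimension: the ensemble members

n_forecasts <- dim(fcst)[tdim]   # 20 for this array
n_members <- dim(fcst)[ensdim]   # 51 for this array
```

If the forecast and ensemble dimensions are stored in a different order, tdim and ensdim would need to be set explicitly.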
List of functions to be called
The selection of verification functions supplied with this package and as part of SpecsVerification can be enquired using ls(pos='package:easyVerification') and ls(pos='package:SpecsVerification'), respectively. Please note, however, that only some of the functions provided as part of SpecsVerification can be used with veriApply.
Functions that can be used include, for example, the (fair) ranked probability score (EnsRps, FairRps) and its skill score (EnsRpss, FairRpss), or the continuous ranked probability score EnsCrps.
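A minimal sketch of passing some of these function names to veriApply, assuming the easyVerification package (and with it SpecsVerification) is installed. The calls are skipped if the package is not available, and the structure of the returned objects is only inspected here, not assumed.

```r
# Attach the package if it is installed; the sketch is skipped otherwise
has_pkg <- suppressWarnings(require("easyVerification", quietly = TRUE))

if (has_pkg) {
  # Synthetic forecasts and observations supplied with the package
  tm <- toyarray()

  # Continuous ranked probability score
  crps <- veriApply("EnsCrps", tm$fcst, tm$obs)

  # Fair ranked probability skill score for tercile categories
  rpss <- veriApply("FairRpss", tm$fcst, tm$obs, prob = 1:2/3)

  # Inspect what was returned
  str(crps)
  str(rpss)
}
```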
Conversion to category forecasts
To automatically convert continuous forecasts into category forecasts, absolute (threshold) or relative thresholds (prob) have to be supplied. For some scores and skill scores (e.g. the ROC area and skill score), a list of categories will be supplied with categories ordered. That is, if prob = 1:2/3 for tercile forecasts, cat1 corresponds to the lower tercile, cat2 to the middle, and cat3 to the upper tercile.
Absolute and relative thresholds can be supplied in various formats. If a vector of thresholds is supplied with the threshold argument, the same threshold is applied to all forecasts (e.g. lead times, spatial locations). If a vector of relative thresholds is supplied using prob, the category boundaries to be applied are computed separately for each space-time location. Relative boundaries specified using prob are computed separately for the observations and forecasts, but jointly for all available ensemble members.
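The idea behind relative thresholds can be sketched in base R for a single location. This is an illustration of the concept only, not the package's implementation: the data are random, and the quantile definition (R's default) and the handling of values that fall exactly on a boundary are assumptions.

```r
set.seed(1)
ntime <- 20  # hypothetical number of forecasts
nens <- 5    # hypothetical number of ensemble members

# Forecasts at one location: rows are forecasts, columns are members
fcst <- matrix(rnorm(ntime * nens), nrow = ntime, ncol = nens)
obs <- rnorm(ntime)
prob <- 1:2 / 3  # tercile boundaries

# Boundaries are computed separately for forecasts and observations;
# for the forecasts, all ensemble members are pooled
fcst_bounds <- quantile(fcst, probs = prob)
obs_bounds <- quantile(obs, probs = prob)

# Category index: 1 = lower, 2 = middle, 3 = upper tercile
fcst_cat <- matrix(findInterval(fcst, fcst_bounds) + 1, nrow = ntime)
obs_cat <- findInterval(obs, obs_bounds) + 1

# Number of ensemble members in each category, per forecast
fcst_counts <- t(apply(fcst_cat, 1, tabulate, nbins = length(prob) + 1))
colnames(fcst_counts) <- paste0("cat", 1:3)
```

Each row of fcst_counts sums to the number of ensemble members, which is the kind of ensemble-count representation that category scores operate on.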
Location-specific thresholds can also be supplied. If the thresholds are supplied as a matrix, the number of rows has to correspond to the number of forecast space-time locations (i.e. length(fcst)/prod(dim(fcst)[c(tdim, ensdim)])). Alternatively, but equivalently, the thresholds can also be supplied with the dimensionality corresponding to the obs array, with the difference that the forecast dimension in obs contains the category boundaries (absolute or relative) and thus may differ in length.
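As a small sketch of the matrix format, the following base R code computes the required number of rows for a hypothetical forecast array and builds a matching matrix of absolute thresholds. The array dimensions and threshold values are made up, and placing one category boundary per column is an assumption based on the row requirement stated above.

```r
# Hypothetical forecast array: 10 x 5 locations, 20 forecasts, 51 members
fcst <- array(0, dim = c(10, 5, 20, 51))
tdim <- 3
ensdim <- 4

# Number of space-time locations, i.e. everything except forecasts and members
nloc <- length(fcst) / prod(dim(fcst)[c(tdim, ensdim)])

# One row per location, one column per category boundary (assumed layout)
threshold <- matrix(rep(c(-0.5, 0.5), each = nloc), nrow = nloc)
dim(threshold)
```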
Out-of-sample reference forecasts
strategy specifies the set-up of the climatological reference forecast for skill scores if no explicit reference forecast is provided. The default is strategy = "none", that is, all available observations are used as equiprobable members of a reference forecast. Alternatively, strategy = "crossval" can be used for leave-one-out crossvalidated reference forecasts, or strategy = "forward" for a forward protocol (see indRef).
Alternatively, a list with named parameters corresponding to the input arguments of indRef can be supplied for more fine-grained control over standard cases. Finally, a list with observation indices to be used for each forecast can also be supplied (see generateRef).
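To make the difference between the "none" and "crossval" strategies concrete, the following base R sketch builds the observation indices that each forecast would draw its reference from. It is a conceptual illustration with a hypothetical number of forecasts and does not call indRef or generateRef; the forward protocol is not sketched here.

```r
nfcst <- 5  # hypothetical number of forecasts

# strategy = "none": every forecast uses all available observations
ref_none <- lapply(seq_len(nfcst), function(i) seq_len(nfcst))

# strategy = "crossval": leave-one-out, the verifying observation is excluded
ref_crossval <- lapply(seq_len(nfcst), function(i) setdiff(seq_len(nfcst), i))

ref_crossval[[2]]  # indices used as reference for the second forecast
```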
Parallel processing
Parallel processing is enabled using the parallel package. Parallel verification uses ncpus FORK clusters or, if ncpus is not specified, one less than the auto-detected number of cores. The maximum number of cores used for parallel processing with auto-detection of the number of available cores can be set with the maxncpus argument.
Progress bars are available for non-parallel computation of the verification metrics. Please note, however, that the progress bar only reflects the computation time of the actual verification metrics; the re-arrangement of input and output is not included.
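The selection of the number of cores described above can be sketched as follows. choose_ncpus is a hypothetical helper written for this illustration and is not part of the package; the lower bound of one core and the fallback when detection fails are assumptions.

```r
library(parallel)

# Hypothetical helper mirroring the described behaviour (not a package function)
choose_ncpus <- function(ncpus = NULL, maxncpus = 16) {
  # An explicitly requested number of CPUs is used as is
  if (!is.null(ncpus)) return(ncpus)
  detected <- detectCores()
  # detectCores() can return NA; fall back to a single core (assumption)
  if (is.na(detected)) detected <- 1L
  # One less than the detected cores, capped at maxncpus, at least one
  max(1L, min(detected - 1L, maxncpus))
}

choose_ncpus()                # self-selected number of CPUs
choose_ncpus(ncpus = 4)       # explicit request
choose_ncpus(maxncpus = 2)    # self-selected, capped at two
```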
Note
If the forecasts and observations are only available as category probabilities (or ensemble counts as used in SpecsVerification) as opposed to as continuous numeric variables, veriApply cannot be used but the atomic verification functions for category forecasts have to be applied directly.
Out-of-sample reference forecasts are not fully supported for categorical forecasts defined on the distribution of forecast values (e.g. using the argument prob). Whereas only the years specified in strategy are used for the reference forecasts, the probability thresholds for the reference forecasts are defined on the collection of years specified in strategy.
See Also
convert2prob for conversion of continuous into category forecasts (and observations)
Examples
tm <- toyarray()
f.me <- veriApply("EnsMe", tm$fcst, tm$obs)
## find more examples and instructions in the vignette
## Not run:
devtools::install_github("MeteoSwiss/easyVerification", build_vignettes = TRUE)
library("easyVerification")
vignette("easyVerification")
## End(Not run)