RPSS {s2dv} | R Documentation |
Compute the Ranked Probability Skill Score
Description
The Ranked Probability Skill Score (RPSS; Wilks, 2011) is the skill score
based on the Ranked Probability Score (RPS; Wilks, 2011). It can be used to
assess whether a forecast presents an improvement or worsening with respect to
a reference forecast. The RPSS ranges between minus infinity and 1. If the
RPSS is positive, it indicates that the forecast has higher skill than the
reference forecast, while a negative value means that it has a lower skill.
Examples of reference forecasts are the climatological forecast (same
probabilities for all categories for all time steps), persistence, a previous
model version, or another model. It is computed as
RPSS = 1 - RPS_exp / RPS_ref.
The statistical significance is obtained based on a Random Walk test at the
specified confidence level (DelSole and Tippett, 2016).
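The formula can be illustrated with a small worked example in base R. This is a sketch only: it assumes the standard RPS definition (sum over categories of the squared difference between cumulative forecast and cumulative observed probabilities, averaged over time), and the helper `rps_mean` and all input values are hypothetical, not part of s2dv.

```r
# Hypothetical helper: mean RPS over time steps.
# probs_fc, probs_ob: matrices with dimensions (category, time);
# each column sums to 1.
rps_mean <- function(probs_fc, probs_ob) {
  cum_fc <- apply(probs_fc, 2, cumsum)  # cumulative forecast probabilities
  cum_ob <- apply(probs_ob, 2, cumsum)  # cumulative observed probabilities
  mean(colSums((cum_fc - cum_ob)^2))
}

# Three time steps, tercile categories (below, normal, above); made-up values
probs_exp <- cbind(c(0.7, 0.2, 0.1), c(0.1, 0.3, 0.6), c(0.2, 0.6, 0.2))
probs_obs <- cbind(c(1, 0, 0), c(0, 0, 1), c(0, 1, 0))
probs_ref <- matrix(1/3, nrow = 3, ncol = 3)  # climatological reference

rps_exp <- rps_mean(probs_exp, probs_obs)
rps_ref <- rps_mean(probs_ref, probs_obs)
rpss <- 1 - rps_exp / rps_ref
rpss  # 0.7375: the forecast improves on the climatological reference
```

A positive value, as here, means the forecast has a lower (better) RPS than the reference.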
The function accepts either the ensemble members or the probabilities of
each dataset as inputs. If there is more than one dataset, RPSS will be
computed for each pair of exp and obs data. The NA ratio of the data is
examined before the calculation. If the ratio is higher than the threshold
(set by parameter na.rm), NA is returned directly. NAs are counted pairwise:
only the time steps at which all the datasets have values count as non-NA
values.
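The pairwise NA counting described above can be sketched in base R as follows. This is an illustration of the stated logic under assumptions, not the s2dv internals: the series, the variable names, and the threshold `min_frac` (standing in for a numeric na.rm) are all hypothetical.

```r
# Two hypothetical time series of equal length
exp_ts <- c(1.2, NA, 0.3, 0.8, NA, 1.1)
obs_ts <- c(0.9, 0.4, NA, 0.7, 0.2, 1.0)

# A time step counts as non-NA only if every dataset has a value there
valid <- !is.na(exp_ts) & !is.na(obs_ts)
frac_valid <- mean(valid)  # fraction of usable time steps (3 of 6 here)

# Assumed threshold: minimum acceptable fraction of non-NA values
min_frac <- 0.6
compute_score <- frac_valid >= min_frac  # FALSE: the result would be NA
```

With only half of the time steps usable and a required fraction of 0.6, the score would not be computed and NA would be returned.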
Usage
RPSS(
  exp,
  obs,
  ref = NULL,
  time_dim = "sdate",
  memb_dim = "member",
  cat_dim = NULL,
  dat_dim = NULL,
  prob_thresholds = c(1/3, 2/3),
  indices_for_clim = NULL,
  Fair = FALSE,
  weights_exp = NULL,
  weights_ref = NULL,
  cross.val = FALSE,
  na.rm = FALSE,
  sig_method.type = "two.sided.approx",
  alpha = 0.05,
  ncores = NULL
)
Arguments
exp |
A named numerical array of either the forecast with at least time
and member dimensions, or the probabilities with at least time and category
dimensions. The probabilities can be generated by GetProbs (see Examples). |
obs |
A named numerical array of either the observation with at least
time dimension, or the probabilities with at least time and category
dimensions. The probabilities can be generated by GetProbs (see Examples). |
ref |
A named numerical array of either the reference forecast with at
least time and member dimensions, or the probabilities with at least time and
category dimensions. The probabilities can be generated by GetProbs (see
Examples). |
time_dim |
A character string indicating the name of the time dimension. The default value is 'sdate'. |
memb_dim |
A character string indicating the name of the member dimension to compute the probabilities of the forecast and the reference forecast. The default value is 'member'. If the data are probabilities, set memb_dim as NULL. |
cat_dim |
A character string indicating the name of the category dimension that is needed when exp, obs, and ref are probabilities. The default value is NULL, which means that the data are not probabilities. |
dat_dim |
A character string indicating the name of the dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. The default value is NULL. |
prob_thresholds |
A numeric vector of the relative thresholds (from 0 to 1) between the categories. The default value is c(1/3, 2/3), which corresponds to tercile equiprobable categories. |
indices_for_clim |
A vector of the indices to be taken along 'time_dim' for computing the thresholds between the probabilistic categories. If NULL, the whole period is used. The default value is NULL. |
Fair |
A logical indicating whether to compute the FairRPSS (the potential RPSS that the forecast would have with an infinite ensemble size). The default value is FALSE. |
weights_exp |
A named numerical array of the forecast ensemble weights for probability calculation. The dimension should include 'memb_dim', 'time_dim' and 'dat_dim' if there are multiple datasets. All dimension lengths must be equal to 'exp' dimension lengths. The default value is NULL, which means no weighting is applied. The ensemble should have at least 70 members or span at least 10 time steps and have more than 45 members if consistency between the weighted and unweighted methodologies is desired. |
weights_ref |
Same as 'weights_exp' but for the reference forecast. |
cross.val |
A logical indicating whether to compute the thresholds between probabilistic categories in cross-validation. The default value is FALSE. |
na.rm |
A logical or numeric value between 0 and 1. If it is numeric, it means the lower limit for the fraction of the non-NA values. 1 is equal to FALSE (no NA is acceptable), 0 is equal to TRUE (all NAs are acceptable). If the fraction of non-NA values is lower than na.rm, NA is returned. Otherwise, the RPSS will be calculated. The default value is FALSE. |
sig_method.type |
A character string indicating the test type of the significance method
(the Random Walk test; see Description). The default value is
'two.sided.approx'. |
alpha |
A numeric of the significance level to be used in the statistical significance test. The default value is 0.05. |
ncores |
An integer indicating the number of cores to use for parallel computation. The default value is NULL. |
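To make the role of prob_thresholds and memb_dim concrete, the following base R sketch turns ensemble members into category probabilities. It is an illustration under assumptions, not the s2dv implementation: the data are random, the absolute thresholds are taken as quantiles of the pooled sample over the whole period, and the quantile type used by the package may differ.

```r
set.seed(42)
n_member <- 10
n_time <- 30
fc <- matrix(rnorm(n_member * n_time), nrow = n_member)  # (member, time)

prob_thresholds <- c(1/3, 2/3)            # relative thresholds (terciles)
n_cat <- length(prob_thresholds) + 1      # number of categories

# Assumption: absolute thresholds are quantiles of all members and time steps
abs_thr <- quantile(fc, probs = prob_thresholds, names = FALSE)

# For each time step, the fraction of members falling in each category
probs <- apply(fc, 2, function(members) {
  cat_idx <- findInterval(members, abs_thr) + 1  # category index 1..n_cat
  tabulate(cat_idx, nbins = n_cat) / length(members)
})

dim(probs)  # category x time
```

Each column of `probs` sums to 1, which is the form expected when probabilities (with a category dimension) are passed instead of ensemble members.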
Value
$rpss |
A numerical array of RPSS with dimensions c(nexp, nobs, the remaining dimensions of 'exp' except the 'time_dim' and 'memb_dim' dimensions). nexp is the number of experiments (i.e., dat_dim in exp), and nobs is the number of observations (i.e., dat_dim in obs). If dat_dim is NULL, nexp and nobs are omitted. |
$sign |
A logical array of the statistical significance of the RPSS with the same dimensions as $rpss. |
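The idea behind the significance flag can be sketched with base R. The Random Walk test of DelSole and Tippett (2016) is built on counting how often one forecast beats the other; the example below substitutes an exact binomial sign test (`binom.test`) for that procedure, so it is a stand-in and not the s2dv implementation. The per-time-step RPS values are made up for illustration.

```r
# Hypothetical per-time-step RPS of the forecast and of the reference
rps_exp_t <- c(0.10, 0.17, 0.08, 0.30, 0.05, 0.12, 0.09, 0.20, 0.11, 0.07)
rps_ref_t <- c(0.56, 0.56, 0.22, 0.22, 0.56, 0.22, 0.56, 0.56, 0.22, 0.22)

# Count the time steps at which the forecast has the lower (better) RPS
wins <- sum(rps_exp_t < rps_ref_t)
n <- length(rps_exp_t)

# Under the null hypothesis of equal skill, each forecast wins with
# probability 0.5, so the win count is binomial
alpha <- 0.05
test <- binom.test(wins, n, p = 0.5, alternative = "two.sided")
significant <- test$p.value < alpha
```

Here the forecast wins at 9 of 10 time steps, so the difference in skill is flagged as significant at the 0.05 level.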
References
Wilks, 2011; https://doi.org/10.1016/B978-0-12-385022-5.00008-7
DelSole and Tippett, 2016; https://doi.org/10.1175/MWR-D-15-0218.1
Examples
set.seed(1)
exp <- array(rnorm(3000), dim = c(lat = 3, lon = 2, member = 10, sdate = 50))
set.seed(2)
obs <- array(rnorm(300), dim = c(lat = 3, lon = 2, sdate = 50))
set.seed(3)
ref <- array(rnorm(3000), dim = c(lat = 3, lon = 2, member = 10, sdate = 50))
weights <- sapply(1:dim(exp)['sdate'], function(i) {
  n <- abs(rnorm(10))
  n/sum(n)
})
dim(weights) <- c(member = 10, sdate = 50)
# Use data as input
res <- RPSS(exp = exp, obs = obs) ## climatology as reference forecast
res <- RPSS(exp = exp, obs = obs, ref = ref) ## ref as reference forecast
res <- RPSS(exp = exp, obs = obs, ref = ref, weights_exp = weights, weights_ref = weights)
res <- RPSS(exp = exp, obs = obs, alpha = 0.01, sig_method.type = 'two.sided')
# Use probs as input
exp_probs <- GetProbs(exp, memb_dim = 'member')
obs_probs <- GetProbs(obs, memb_dim = NULL)
ref_probs <- GetProbs(ref, memb_dim = 'member')
res <- RPSS(exp = exp_probs, obs = obs_probs, ref = ref_probs, memb_dim = NULL,
            cat_dim = 'bin')