regsimq {lmomRFA} | R Documentation |
Compute error bounds for a fitted regional frequency distribution
Description
Computes, using Monte Carlo simulation, relative error bounds
for estimated quantiles of a regional frequency distribution
fitted by regfit
.
Usage
regsimq(qfunc, para, cor = 0, index = NULL, nrec, nrep = 10000,
fit = "gev", f = c(0.01, 0.1, 0.5, 0.9, 0.99, 0.999),
boundprob = c(0.05, 0.95), save = TRUE)
Arguments
qfunc |
List containing the quantile functions for each site. Can also be a single quantile function, which will be used for each site. |
para |
Parameters of the quantile functions at each site.
If |
cor |
Specifies the correlation matrix of the frequency distribution of each site's data. Can be a matrix (which will be rescaled to a correlation matrix if necessary) or a constant (which will be taken as the correlation between each pair of sites). |
index |
Specifies the value of the site-specific scale factor
(“index flood”) at each site. Can be:
a vector, containing the values at each site;
a constant, which will be taken to be the index flood value at each site;
or (the default) |
nrec |
Numeric vector containing the record lengths at each site. |
nrep |
Number of simulated regions. |
fit |
Character string specifying the distribution to be fitted. See “Details” below. |
f |
Vector of probabilities corresponding to the quantiles whose accuracy is to be estimated. |
boundprob |
Vector of probabilities for which error bounds will be computed. |
save |
Logical. Should the simulated values of the ratio of
the estimated to the true regional growth curve be saved?
These values are needed when |
Details
A realization of data from a region is generated as follows.
The frequency distributions at sites (specified by
arguments qfunc
and para
) are expressed
as Q_i(F)=\mu_i q_i(F)
where \mu_i
is the site-specific scale factor
(“index flood”) and q_i(F)
is the at-site growth curve.
At each simulation run the at-site growth curves of each site
are randomly permuted, and are scaled by the (unpermuted)
index flood values for the sites.
Data are simulated from these frequency distributions,
with inter-site correlation specified by argument cor
and record lengths at each site specified by argument nrec
.
The regional frequency distribution specified by argument fit
is then fitted to the simulated data, as in function regfit
.
The procedure is as described in Hosking and Wallis (1997), Table 6.1,
except that the permutation of at-site growth curves is a later
modification, intended to give more realistic sets of simulated data.
For more details, including exact definitions of quantities computed
in the simulation and returned by functions regsimq
,
regquantbounds
, and regsitebounds
, see vignette RegSim
.
From each realization the sample mean values and estimates of the
quantiles of the regional growth curve, for nonexceedance probabilities
specified by argument f
, are saved.
From the simulated values, for each quantile specified by argument f
the relative root mean square error (relative RMSE) is computed as in
Hosking and Wallis (1997, eq. (6.15)).
Error bounds are also computed, as in Hosking and Wallis (1997, eq. (6.18))
but with bound probabilities specified by argument boundprob
rather than the fixed values 0.05 and 0.95 considered by Hosking and Wallis.
The error bounds are sample quantiles, across the nrep
realizations,
of the ratio of the estimated regional growth curve
to the true at-site growth curve
or of the ratio of the estimated to the true quantiles at individual sites.
For distribution fit
there should exist a function to estimate
the parameters of the distribution given a set of L
-moments.
The function should have a name that is the character string
"pel"
followed by the character string fit
.
It should accept a single argument, a vector containing L
-moments
\ell_1
, \ell_2
, t_3
, t_4
, etc.,
and return a vector of distribution parameters.
For distribution fit
there should also exist a quantile function,
which should have a name that is the character string
"qua"
followed by the character string fit
.
It should accept two arguments: a vector of probabilities
and a vector containing the parameters of the distribution.
The search path used to find the "pel"
and "qua"
functions
is the same as for arguments supplied to regsimq
, i.e.
the enclosing frames of the function, followed by the search path
specified by search()
.
The estimation routines and quantile functions in package lmom
have the form described here. For example, to use a
generalized extreme value distribution set fit
to be
the string "gev"
; then the fitting function pelgev
and the quantile function quagev
will be used
(unless these functions have been masked by another object
on the search path).
Value
An object of class "regsimq"
.
This is a list with the following components:
f |
Vector of probabilities corresponding to the quantiles
whose accuracy is to be estimated. A copy of argument |
boundprob |
Vector of probabilities corresponding to the error bounds.
A copy of argument |
nrep |
Number of simulated regions. |
relbounds.rgc |
Data frame containing the relative RMSE and
error bounds for the regional growth curve. It has columns
|
relbounds.by.site |
List of |
true.asgc |
If If |
sim.rgc |
If If |
Note
Error bounds for the regional growth curve apply to the
regional growth curve regarded as an estimator of the
growth curve for a randomly chosen site.
The growth curve for site i
is defined to be
q_i(\,.\,)=Q_i(\,.\,)/\mu_i
where Q_i(\,.\,)
is the site's quantile function
and \mu_i
is the site-specific scale factor
(“index flood”) for site i
.
For each of the nrep
simulated regions,
and each probability F
in f
,
the regional growth curve \hat{q}(F)
is estimated
and the ratios \hat{q}(F)/q_i(F)
are calculated.
The relative error bounds are empirical quantiles,
corresponding to the probabilities in boundprob
,
of the nrep * length(nrec)
values of these ratios
obtained from the simulations.
This differs from the approach taken in Hosking and Wallis (1997),
Table 6.2 and Fig. 6.2, in which error bounds are computed regarding
the estimated regional growth curve as an estimator of the regional
average growth curve q^{\rm R}(\,.\,)
, the harmonic mean
of the sites' growth curves (Hosking and Wallis, 1997, p. 102).
When the parent region is homogeneous, with identical frequency distributions at each site (apart from a site-specific scale factor), the two approaches give identical results. For heterogeneous regions the “regard as estimator of randomly chosen site growth curve” approach yields error bounds that are wider, but are more realistic given that the ultimate aim of regional frequency analysis is estimation of quantiles at individual sites.
Author(s)
J. R. M. Hosking jrmhosking@gmail.com
References
Hosking, J. R. M., and Wallis, J. R. (1997).
Regional frequency analysis: an approach based on L
-moments.
Cambridge University Press.
See Also
regfit
, for details of fitting a regional frequency distribution;
regquantbounds
and sitequantbounds
, for
converting the relative bounds returned by regsimq
into absolute bounds
for quantiles of the regional growth curve or of the
frequency distributions at individual sites.
Examples
data(Cascades) # A regional data set
rmom <- regavlmom(Cascades) # Regional average L-moments
# Fit generalized normal distribution to regional data
rfit <- regfit(Cascades, "gno")
# Set up an artificial region to be simulated:
# -- Same number of sites as Cascades
# -- Same record lengths as Cascades
# -- Same site means as Cascades
# -- L-CV varies linearly across sites, with mean value equal
# to the regional average L-CV for the Cascades data.
# 'LCVrange' specifies the range of L-CV across the sites.
# -- L-skewness the same at each site, and equal to the regional
# average L-skewness for the Cascades data
nsites <- nrow(Cascades)
means <- Cascades$mean
LCVrange <- 0.025
LCVs <- seq(rmom[2]-LCVrange/2, rmom[2]+LCVrange/2, len=nsites)
Lskews<-rep(rmom[3], nsites)
# Each site will have a generalized normal distribution:
# get the parameter values for each site
pp <- t(apply(cbind(means, means*LCVs ,Lskews), 1, pelgno))
# Set correlation between each pair of sites to 0.64, the
# average inter-site correlation for the Cascades data
avcor <- 0.64
# Run the simulation. To save time, use only 100 replications.
simq <- regsimq(qfunc=quagno, para=pp, cor=avcor, nrec=Cascades$n,
nrep=100, fit="gno")
# Relative RMSE and error bounds for the regional growth curve
simq$relbounds.rgc
# Relative RMSE and error bounds for quantiles at site 3
simq$relbounds.by.site[[3]]