regtst {lmomRFA} | R Documentation |
Test statistics for regional frequency analysis
Description
Computes discordancy, heterogeneity and goodness-of-fit measures
for regional frequency analysis.
These are the statistics D_i, H, and Z^DIST
defined respectively in sections 3.2.3, 4.3.3, and 5.2.3 of
Hosking and Wallis (1997).
Usage
regtst(regdata, nsim=1000)
regtst.s(regdata, nsim=1000)
Arguments
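regdata |
Object of class "regdata", containing summary statistics of the data for the sites in a region. Note that the fourth column should contain values of the L-CV, not the L-scale. Function regsamlmu (with its default arguments) returns an object of this class. |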
nsim |
Number of simulations to use in the calculation of the heterogeneity and goodness-of-fit measures. If less than 2, only the discordancy measure will be calculated. |
Details
The discordancy measure D_i indicates, for site i,
the discordancy between the site's L-moment ratios
and the (unweighted) regional average L-moment ratios.
Large values might be used as a flag to indicate potential errors
in the data at the site. “Large” might be 3 for regions with 15
or more sites, but less (exact values in list element Dcrit)
for smaller regions.
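As a rough illustration of how the discordancy flag might be applied, the sketch below compares each site's D value with the first element of Dcrit. It assumes the lmomRFA package is installed and uses its Cascades data set (the one used in the Examples). The names rt and flagged are illustrative only, and nsim=0 is passed so that only the discordancy measure is computed.

```r
library(lmomRFA)

# nsim < 2: only the discordancy measure is calculated
rt <- regtst(Cascades, nsim = 0)

# Indices of sites whose discordancy reaches the first critical value
flagged <- which(rt$D >= rt$Dcrit[1])

# Show the input rows for any flagged sites (the result may be empty)
rt$data[flagged, ]
```

A flagged site is a prompt to re-examine its data, not by itself a reason to drop the site.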
Three heterogeneity measures are calculated, each based on
a different measure of between-site dispersion of L-moment ratios:
[1] weighted standard deviation of L-CVs;
[2] average of L-CV/L-skew distances;
[3] average of L-skew/L-kurtosis distances.
These dispersion measures are the quantities V, V_2, and V_3
defined respectively in equations (4.4), (4.6), and (4.7)
of Hosking and Wallis (1997).
The heterogeneity measures are calculated from them as in
equation (4.5) of Hosking and Wallis (1997).
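For reference, equation (4.5) standardizes each observed dispersion measure by the mean and standard deviation of its simulated values. The form below writes this with an index j = 1, 2, 3 over the three dispersion measures (V_1 being the quantity written V above). The mapping onto the list elements vobs, vbar, vsd and H is an interpretation of the Value section, not a quotation from the book.

```latex
H_j = \frac{V_j^{\mathrm{obs}} - \mu_{V_j}}{\sigma_{V_j}},
\qquad j = 1, 2, 3,
\qquad\text{with}\quad
V_j^{\mathrm{obs}} = \texttt{vobs[j]},\quad
\mu_{V_j} = \texttt{vbar[j]},\quad
\sigma_{V_j} = \texttt{vsd[j]},\quad
H_j = \texttt{H[j]}.
```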
In practice H[1] is probably sufficient. A value greater than
(say) 1.0 suggests that further subdivision of the region should
be considered as it might improve the accuracy of quantile estimates.
Goodness of fit is evaluated for five candidate distributions:
generalized logistic,
generalized extreme value,
generalized normal (lognormal),
Pearson type III (3-parameter gamma), and
generalized Pareto.
In the output the distributions are referred to by 3-letter abbreviations,
respectively glo, gev, gno, pe3, and gpa.
If the region is homogeneous and data at different sites are
statistically independent, then if one of the distributions is
the true distribution for the region its goodness-of-fit measure
should have approximately a standard normal distribution.
Provided that the region is acceptably close to homogeneous,
the fit may be judged acceptable at the 10 per cent significance level
if the Z value is less than 1.645 (i.e., qnorm(0.95)) in absolute value.
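The acceptance rule above can be sketched in a few lines of base R. The Z values here are hypothetical numbers made up for illustration, not output from any real data set, and the names z, zcrit and acceptable are likewise illustrative.

```r
# Hypothetical Z values for the five candidate distributions
# (made up for illustration; not results from real data)
z <- c(glo = 3.21, gev = 1.02, gno = -0.35, pe3 = -2.10, gpa = -5.80)

# Threshold for acceptance at the 10 per cent significance level
zcrit <- qnorm(0.95)

# Distributions whose Z value is below the threshold in absolute value
acceptable <- names(z)[abs(z) < zcrit]
acceptable
```

With a real analysis, the same comparison could be applied to the Z element of the returned object.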
Calculation of heterogeneity and goodness-of-fit measures
involves the sampling variability of L-moment ratios
in a homogeneous region whose record lengths and
average L-moment ratios match those of the data.
The sampling variability is estimated by Monte Carlo simulation
using nsim replications of the region.
Results will vary between invocations of regtst
with different seeds for the random-number generator.
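If repeatable results are wanted, one option is to fix the seed before each call. The sketch below assumes that regtst draws its random numbers from R's own generator, as the preceding sentence suggests, and that the lmomRFA package and its Cascades data set are available. The seed value and the names run1 and run2 are arbitrary.

```r
library(lmomRFA)

# Set the same seed before each call so both runs use the same draws
set.seed(1234)
run1 <- regtst(Cascades, nsim = 200)
set.seed(1234)
run2 <- regtst(Cascades, nsim = 200)

# Compare the heterogeneity and goodness-of-fit measures of the two runs
all.equal(run1$H, run2$H)
all.equal(run1$Z, run2$Z)
```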
In the homogeneous region used in the simulations, the sites have a
kappa distribution, fitted to the regional average L-moment ratios
of the data in regdata. The kappa fit may fail if the regional average
L-kurtosis is high relative to the regional average L-skewness.
In this case a kappa distribution is fitted with shape parameter h
constrained to be -1 (i.e., a generalized logistic distribution);
this gives the largest possible L-kurtosis value for a kappa distribution
with given L-skewness.
regtst and regtst.s are functionally identical.
regtst calls a Fortran routine internally and is faster,
typically by a factor of 3 or 4.
regtst.s is written almost entirely in the S language;
it is provided so that users can see how the calculations are done,
and can conveniently alter the code for their own purposes if necessary.
Value
An object of class "regtst", which is a list with elements as follows.
data |
The input data, i.e. data frame regdata. |
nsim |
Number of simulations, i.e. the argument nsim. |
D |
Vector containing the discordancy measures for each site. |
Dcrit |
Vector of length 2 containing critical values of the discordancy measure corresponding to significance levels of 10 and 5 per cent — except that the values never exceed 3 and 4 respectively. See Hosking and Wallis (1997), section 3.2.4. |
rmom |
Vector of length 5 containing the regional weighted average L-moment ratios. |
rpara |
Vector of length 4 containing the parameters of a kappa distribution fitted to the regional weighted average L-moment ratios. |
vobs |
Vector of length 3 containing the observed values of the three measures of between-site dispersion of L-moment ratios. |
vbar |
Vector of length 3 containing the mean of the simulated values of the three dispersion measures. |
vsd |
Vector of length 3 containing the standard deviation of the simulated values of the three dispersion measures. |
H |
Vector of length 3 containing the three measures of regional heterogeneity. |
para |
List of length 6 containing the parameters of the five candidate distributions and the Wakeby distribution (3-letter abbreviation wak). |
t4fit |
Vector of length 5 containing the L-kurtosis values of the five candidate distributions. |
Z |
Vector of length 5 containing the goodness-of-fit measures for each of the five candidate distributions. |
Note
Data frame regdata may have only six columns,
i.e. the fifth L-moment ratio t_5 may be omitted.
In this case the return value will contain missing values for
rmom[5] and the elements of para$wak.
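A six-column input might be produced as sketched below. This assumes that regsamlmu accepts an nmom argument and that nmom = 4 yields the columns name, record length, mean, L-CV, L-skewness and L-kurtosis; check the regsamlmu help page before relying on it. The names rd6 and rt6 are illustrative only.

```r
library(lmomRFA)

# Assumed: nmom = 4 omits t_5, leaving six columns
rd6 <- regsamlmu(Maxwind, nmom = 4)
ncol(rd6)

rt6 <- regtst(rd6, nsim = 100)

# Per the note above, these are expected to be missing values
rt6$rmom[5]
rt6$para$wak
```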
Author(s)
J. R. M. Hosking jrmhosking@gmail.com
References
Hosking, J. R. M. (1996).
Fortran routines for use with the method of L-moments, Version 3.
Research Report RC20525, IBM Research Division, Yorktown Heights, N.Y.
Hosking, J. R. M., and Wallis, J. R. (1997).
Regional frequency analysis: an approach based on L-moments.
Cambridge University Press.
See Also
summary.regtst for summaries.
Examples
# An example from Hosking (1996). Compare the output with
# the file 'cascades.out' in the LMOMENTS Fortran package at
# http://lib.stat.cmu.edu/lmoments/general (results will not
# be identical, because random-number generators are different).
summary(regtst(Cascades, nsim=500))
# Output from 'regsamlmu' can be fed straight into 'regtst'
regtst(regsamlmu(Maxwind))