decluster {extRemes} | R Documentation |
Decluster Data Above a Threshold
Description
Decluster data above a given threshold to try to make them independent.
Usage
decluster(x, threshold, ...)
## S3 method for class 'data.frame'
decluster(x, threshold, ..., which.cols, method = c("runs", "intervals"),
clusterfun = "max")
## Default S3 method:
decluster(x, threshold, ..., method = c("runs", "intervals"),
clusterfun = "max")
## S3 method for class 'intervals'
decluster(x, threshold, ..., clusterfun = "max", groups = NULL, replace.with,
na.action = na.fail)
## S3 method for class 'runs'
decluster(x, threshold, ..., data, r = 1, clusterfun = "max", groups = NULL,
replace.with, na.action = na.fail)
## S3 method for class 'declustered'
plot(x, which.plot = c("scatter", "atdf"), qu = 0.85, xlab = NULL,
ylab = NULL, main = NULL, col = "gray", ...)
## S3 method for class 'declustered'
print(x, ...)
Arguments
x |
An R data set to be declustered. Can be a data frame or a numeric vector. If a data frame, then
|
data |
A data frame containing the data. |
threshold |
numeric of length one or of the same length as the data, giving the threshold above which (non-inclusive) the data are to be declustered. |
qu |
quantile for |
which.cols |
numeric of length one or two. The first component tells which column is the one to decluster, and the second component tells which, if any, column is to serve as groups. |
which.plot |
character string naming the type of plot to make. |
method |
character string naming the declustering method to employ. |
clusterfun |
character string naming a function to be applied to the clusters (the returned value is used). Typically, for extreme value analysis (EVA), this will be the cluster maximum (default), but other options are ok as long as they return a single number. |
groups |
numeric of length |
r |
integer run length stating how many threshold deficits should be used to define a new cluster. |
replace.with |
number, NaN, Inf, -Inf, or NA. What should the remaining values in the cluster be replaced with? The default replaces them with |
na.action |
function to be called to handle missing values. |
xlab , ylab , main , col |
optional arguments to the
... |
optional arguments to
Not used by |
Details
Runs declustering (see Coles, 2001 sec. 5.3.2): Extremes separated by fewer than r non-extremes belong to the same cluster.
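The runs rule can be illustrated with a minimal base-R sketch. The helper `runs_clusters` is hypothetical and is not part of extRemes; it only mirrors the rule stated above (a new cluster starts once at least r non-extremes separate two exceedances) and ignores groups and missing values.

```r
# Hypothetical helper (not part of extRemes): assigns a cluster id to each
# exceedance using the runs rule described above.
runs_clusters <- function(x, threshold, r = 1) {
  exc <- which(x > threshold)          # positions strictly above the threshold
  if (length(exc) == 0L) return(integer(0))
  gaps <- diff(exc) - 1L               # non-extremes between consecutive exceedances
  # Fewer than r non-extremes: same cluster; otherwise a new cluster begins.
  id <- cumsum(c(1L, as.integer(gaps >= r)))
  names(id) <- exc
  id
}

x <- c(1, 5, 6, 1, 7, 1, 1, 8, 9, 1)
id <- runs_clusters(x, threshold = 4, r = 2)
# Cluster maxima, the usual summary for extreme value analysis
maxima <- tapply(x[x > 4], id, max)
print(id)
print(maxima)
```

With r = 2, the exceedances at positions 2, 3 and 5 share a cluster because only one non-extreme separates positions 3 and 5, while positions 8 and 9 form a second cluster.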
Intervals declustering (Ferro and Segers, 2003): Extremes separated by fewer than r non-extremes belong to the same cluster, where r is the nc-th largest interexceedance time and nc, the number of clusters, is estimated from the extremal index, theta, and the times between extremes. Setting theta = 1 causes each extreme to form a separate cluster.
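As a rough illustration of where theta comes from, the following base-R sketch computes the intervals estimator of the extremal index given in Ferro and Segers (2003) from the interexceedance times. The helper `intervals_theta` is hypothetical and is not the extRemes implementation; how the package turns theta into nc and r is not reproduced here.

```r
# Hypothetical helper (not part of extRemes): intervals estimator of the
# extremal index, following the formula in Ferro and Segers (2003).
intervals_theta <- function(x, threshold) {
  exc <- which(x > threshold)          # positions of exceedances
  N <- length(exc)
  if (N < 2L) stop("need at least two exceedances")
  Tn <- diff(exc)                      # interexceedance times
  if (max(Tn) <= 2) {
    theta <- 2 * sum(Tn)^2 / ((N - 1) * sum(Tn^2))
  } else {
    theta <- 2 * sum(Tn - 1)^2 / ((N - 1) * sum((Tn - 1) * (Tn - 2)))
  }
  min(1, theta)                        # the estimate is capped at one
}

# Three tight clusters of three exceedances each
x_clustered <- rep(0, 50)
x_clustered[c(1:3, 21:23, 41:43)] <- 10
theta_clustered <- intervals_theta(x_clustered, threshold = 5)

# Evenly spaced, isolated exceedances
x_isolated <- rep(0, 50)
x_isolated[c(1, 11, 21, 31)] <- 10
theta_isolated <- intervals_theta(x_isolated, threshold = 5)

print(c(clustered = theta_clustered, isolated = theta_isolated))
```

The clustered series gives an estimate well below one, whereas the isolated exceedances give an estimate of one, in line with the remark that theta = 1 places each extreme in its own cluster.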
The print method reports the resulting extremal index estimate, based on either the runs or the intervals estimate depending on the method argument, as well as the number of clusters and the run length. For runs declustering, the run length is the one supplied by the user; for the intervals method, it is an estimated run length for the resulting declustered data. Note that if the declustered data are independent, the extremal index should be close (if not equal) to one.
Value
A numeric vector of class “declustered” is returned with various attributes including:
call |
the function call. |
data.name |
character string giving the name of the data. |
decluster.function |
value of |
method |
character string naming the method. Same as input argument. |
threshold |
threshold used for declustering. |
groups |
character string naming the data used for the groups when applicable. |
run.length |
the run length used (or estimated if “intervals” method employed). |
na.action |
function used to handle missing values. Same as input argument. |
clusters |
numeric giving the clusters of threshold exceedances. |
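The attributes listed above can be read with attr(). The sketch below assumes the extRemes package is installed and that the attribute names match those in the list above; the seed is illustrative and does not appear in the original examples.

```r
# Sketch only: the body runs only when the extRemes package is installed.
if (requireNamespace("extRemes", quietly = TRUE)) {
  set.seed(1)                                  # illustrative seed (an assumption)
  y <- rnorm(100, mean = 40, sd = 20)
  y <- pmax(y[1:99], y[2:100])                 # induce short-range dependence
  ydc <- extRemes::decluster(y, threshold = quantile(y, probs = 0.75), r = 1)

  # Attribute names are taken from the Value section above.
  print(attr(ydc, "method"))
  print(attr(ydc, "run.length"))
  print(attr(ydc, "threshold"))
}
```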
Author(s)
Eric Gilleland
References
Coles, S. (2001) An introduction to statistical modeling of extreme values, London, U.K.: Springer-Verlag, 208 pp.
Ferro, C. A. T. and Segers, J. (2003). Inference for clusters of extreme values. Journal of the Royal Statistical Society B, 65, 545–556.
See Also
extremalindex, datagrabber, fevd
Examples
y <- rnorm(100, mean=40, sd=20)
y <- apply(cbind(y[1:99], y[2:100]), 1, max)
bl <- rep(1:3, each=33)
ydc <- decluster(y, quantile(y, probs=c(0.75)), r=1, groups=bl)
ydc
plot(ydc)
## Not run:
data("Tphap")  # load the data set used in the examples below
look <- decluster(-Tphap$MinT, threshold=-73)
look
plot(look)
# The code cannot currently grab data passed as an expression like the one above.
# Better:
y <- -Tphap$MinT
look <- decluster(y, threshold=-73)
look
plot(look)
# Even better. Use a non-constant threshold.
u <- -70 - 7 *(Tphap$Year - 48)/42
look <- decluster(y, threshold=u)
look
plot(look)
# Better still: account for the fact that there are huge
# gaps in data from one year to another.
bl <- Tphap$Year - 47
look <- decluster(y, threshold=u, groups=bl)
look
plot(look)
# Now try the above with intervals declustering and compare
look2 <- decluster(y, threshold=u, method="intervals", groups=bl)
look2
dev.new()
plot(look2)
# Looks about the same,
# but note that the run length is estimated to be 5.
# Same resulting number of clusters, however.
# May result in different estimate of the extremal
# index.
#
fit <- fevd(look, threshold=u, type="GP", time.units="62/year")
fit
plot(fit)
# cf.
fit2 <- fevd(-MinT~1, Tphap, threshold=u, type="GP", time.units="62/year")
fit2
dev.new()
plot(fit2)
#
fit <- fevd(look, threshold=u, type="PP", time.units="62/year")
fit
plot(fit)
# cf.
fit2 <- fevd(-MinT~1, Tphap, threshold=u, type="PP", time.units="62/year")
fit2
dev.new()
plot(fit2)
## End(Not run)