RecordTest-package {RecordTest} | R Documentation |
RecordTest: A Package for Testing the Classical Record Model
Description
RecordTest provides data preparation, exploratory data analysis and inference tools based on the theory of records to describe record occurrence and to detect trends, change-points or non-stationarities in the tails of a time series. Details about the implemented tools can be found in Castillo-Mateo, Cebrián and Asín (2023a, 2023b).
Details
The Classical Record Model:
Record statistics are used primarily to quantify the stochastic behaviour
of a process at never-seen-before values, either upper or lower. The setup
of independent and identically distributed (IID) continuous random
variables (RVs), often called the classical record model, is
particularly interesting because the common continuous distribution
underlying the RVs does not affect the distribution of the statistics
related to record occurrence. Many fields have begun to use
the theory of records to study these remarkable events. Particularly
productive is the study of record-breaking temperatures and their
connection with climate change, but records in other environmental
fields (precipitation, floods, earthquakes, etc.), economics, biology,
physics and even sports have also been analysed.
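The distribution-free property can be checked by hand on a small case. The following is a minimal base R sketch, not part of the package: because all orderings of IID continuous values are equally likely, it enumerates every ordering of T distinct ranks and computes the exact proportion of orderings with an upper record at each time t. The helper name all_perms and the choice T = 5 are assumptions made only for illustration.

```r
# Hypothetical helper: list every permutation of the vector v
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

T_len <- 5
perms <- all_perms(seq_len(T_len))

# With distinct values, time t is an upper record exactly when the
# value equals the running maximum
is_record <- t(sapply(perms, function(x) x == cummax(x)))

# Exact record probability at each time t over all equally likely orderings
p_exact <- colMeans(is_record)
print(rbind(t = seq_len(T_len), p_exact = p_exact, one_over_t = 1 / seq_len(T_len)))
```

Only the ranks of the observations enter the computation, which is why the result does not depend on the underlying continuous distribution.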
See Arnold, Balakrishnan and Nagaraja (1998) for an extensive theoretical
introduction to the theory of records and in particular the classical
record model. See Foster and Stuart (1954), Diersen and Trenkler (1996,
2001) and Cebrián, Castillo-Mateo and Asín (2022) for the
distribution-free trend detection tests, and Castillo-Mateo (2022) for the
distribution-free change-point detection tests based on the classical
record model. See Castillo-Mateo, Cebrián and Asín (2023b) for the
permutation versions of these tests. For an easy introduction to RecordTest, use
vignette("RecordTest") and see Castillo-Mateo, Cebrián and Asín (2023a).
This package provides tests of the classical record model hypothesis, that is, that the record occurrences in a series of values observed at regular time units come from an IID series of continuous RVs. For sequences of independent variables with no seasonal component, testing the IID hypothesis is equivalent to testing homogeneity and stationarity.
The functions in the data preparation step:
The functions admit a vector X corresponding to a single series as
an argument. However, some situations could take advantage of having
M uncorrelated vectors to infer from the sample. Then, the input of
the functions can be a matrix X where each column corresponds to a
vector formed by the values of a series X_t, for t = 1, ..., T, so that
each row of the matrix corresponds to a time t.
In many real problems, such as those related to environmental phenomena,
the series of variables to analyse show a seasonal behaviour, and only one
realisation is available. In order to be able to apply the suggested tools
to detect the existence of a trend, the seasonal component has to be
removed and a sample of M uncorrelated series should be obtained.
Those problems can be solved by preparing the data adequately.
A wide set of tools to carry out a preliminary analysis and to prepare
data with a seasonal pattern are implemented in the following functions.
Note that the M series can be dependent if the p-values are
approximated by permutations.
series_record: If only the record times are available.
series_split, series_double: To split the series in several subseries and remove the seasonal component and autocorrelation.
series_uncor: To extract a subset of uncorrelated subseries.
series_ties, series_untie: To deal with record ties.
series_rev: To study the series backwards.
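The idea behind the splitting step can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: a single seasonal series is rearranged into a matrix with one row per period (for example, a year) and one column per position within the period, so each column is a subseries free of the seasonal component. The toy data, the period of 4 and the rule of keeping every second column are hypothetical; the package's own selection of uncorrelated subseries may follow a different rule.

```r
# Hypothetical toy input: 3 "years" of a series with period 4
period <- 4
x <- c(11, 12, 13, 14,
       21, 22, 23, 24,
       31, 32, 33, 34)

# Row = year (time t), column = position within the year (one subseries each)
X <- matrix(x, ncol = period, byrow = TRUE)

# Crude stand-in for choosing subseries far enough apart to be treated
# as uncorrelated: keep every second column (an assumption for illustration)
X_sub <- X[, seq(1, period, by = 2), drop = FALSE]

print(X)
print(X_sub)
```

The resulting matrix has the layout described above: T rows for the time index and M columns for the subseries.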
The functions to compute the record statistics are:
I.record: Computes the observed record indicators. NA values are taken as no records unless they appear at t = 1.
N.record, Nmean.record: Compute the observed number of records up to time t.
S.record: Computes the observed number of records at every time t, using M series.
p.record: Computes the estimated record probability at every time t, using M series.
L.record: Computes the observed record times.
R.record: Computes the observed record values.
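To make these statistics concrete, here is a minimal base R sketch for upper records. It does not call the package and is not its implementation: the helper upper_record_indicator is a hypothetical name, the data are invented, NA values are assumed absent, and the estimated record probability is taken to be the proportion of the M series with a record at time t.

```r
# Hypothetical helper: 1 if x[t] strictly exceeds all earlier values, else 0
# (the first observation is always a record)
upper_record_indicator <- function(x) {
  c(1L, as.integer(x[-1] > cummax(x)[-length(x)]))
}

x <- c(3.1, 2.4, 4.0, 3.9, 5.2, 5.0, 6.3)

I_upper <- upper_record_indicator(x)  # record indicators
N_upper <- cumsum(I_upper)            # number of records up to time t
L_upper <- which(I_upper == 1L)       # record times
R_upper <- x[L_upper]                 # record values

# Two series as columns: number of records and estimated record
# probability at every time t
X <- cbind(x, rev(x))
I_mat <- apply(X, 2, upper_record_indicator)
S_upper <- rowSums(I_mat)
p_hat <- S_upper / ncol(X)
```

Lower records follow from the same code applied to -x.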
The functions to compute the tests:
All the tests performed are distribution-free/non-parametric tests in
time series for trend, change-point and non-stationarity in the extremes
of the distribution based on the null hypothesis that the record
indicators are independent and the probabilities of record at time t
are p_t = 1 / t.
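Under this null hypothesis the number of records in a single series up to time T is a sum of independent Bernoulli indicators with success probabilities 1/t, so its mean is the sum of 1/t and its variance is the sum of (1/t)(1 - 1/t). The base R sketch below computes both and a plain normal-approximation p-value for an excess of upper records. The observed count N_obs is a hypothetical value, and the calculation is only meant to convey the idea; it is not claimed to reproduce any of the package's tests.

```r
T_len <- 50
t_idx <- seq_len(T_len)
p_t <- 1 / t_idx                 # null record probabilities

E_N <- sum(p_t)                  # expected number of records up to T
V_N <- sum(p_t * (1 - p_t))      # variance under independence

# Hypothetical observed number of upper records in one series
N_obs <- 9

# One-sided normal approximation: more records than expected suggests
# an increasing trend in the upper tail
z <- (N_obs - E_N) / sqrt(V_N)
p_value <- pnorm(z, lower.tail = FALSE)

print(c(mean = E_N, variance = V_N, z = z, p_value = p_value))
```

The expected number of records grows only logarithmically in T, which is why a handful of extra records can already be informative.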
change.point: Implements Castillo-Mateo change-point tests.
foster.test: Implements Foster-Stuart and Diersen-Trenkler trend tests.
N.test: Implements tests based on the (weighted) number of records.
brown.method: Brown's method to combine dependent p-values from N.test.
fisher.method: General function to apply Fisher's method to independent p-values.
p.regression.test: Implements a regression test based on the record probabilities.
p.chisq.test: Implements a \chi^2-test based on the record probabilities.
lr.test: Implements likelihood ratio tests based on the record indicators.
score.test: Implements score or Lagrange multiplier tests based on the record indicators.
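As an illustration of combining independent p-values, the following base R sketch applies the textbook form of Fisher's method: minus twice the sum of the log p-values is compared with a chi-squared distribution with 2k degrees of freedom, where k is the number of p-values. The function name fisher_combine and the input values are hypothetical; this is not the package's fisher.method and makes no claim about its arguments or output.

```r
# Hypothetical helper implementing the textbook Fisher combination
fisher_combine <- function(p) {
  stopifnot(all(p > 0 & p <= 1))
  stat <- -2 * sum(log(p))   # Fisher's statistic
  df <- 2 * length(p)        # chi-squared degrees of freedom
  list(statistic = stat,
       df = df,
       p.value = pchisq(stat, df = df, lower.tail = FALSE))
}

# Invented p-values, e.g. from tests on independent series
res <- fisher_combine(c(0.04, 0.10, 0.30))
print(res)
```

The chi-squared reference distribution relies on the p-values being independent; for dependent p-values a correction such as Brown's method is needed.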
The functions to compute the graphical tools:
records: Shows the series highlighting its records.
L.plot: Shows record times in several series.
foster.plot: Shows plots based on Foster-Stuart and Diersen-Trenkler statistics.
N.plot: Shows the (weighted) number of records.
p.plot: Shows the record probabilities in different plots.
All the tests and graphical tools can be applied to both upper and lower records in the forward and backward directions.
Other functions:
rcrm: Random generation for the classical record model.
dpoisbinom, ppoisbinom, qpoisbinom, rpoisbinom: Density, distribution
function, quantile function and random generation for the Poisson binomial
distribution. Related to the probability distribution of the
number of records under the null hypothesis.
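The link between the Poisson binomial distribution and the number of records can be sketched in base R. Under the null hypothesis the number of records up to time T is a sum of independent Bernoulli variables with probabilities 1/t, so its exact probability mass function can be built by adding one indicator at a time. The function poisbinom_pmf is a hypothetical helper written for this illustration, not the package's dpoisbinom.

```r
# Hypothetical helper: pmf of a sum of independent Bernoulli(p[i]) variables,
# built by convolving one indicator at a time
poisbinom_pmf <- function(p) {
  pmf <- 1  # before adding any indicator, the sum is 0 with probability 1
  for (p_i in p) {
    # Either the new indicator is 0 (count unchanged) or 1 (count shifts by one)
    pmf <- c(pmf * (1 - p_i), 0) + c(0, pmf * p_i)
  }
  pmf       # pmf[k + 1] = P(sum = k), for k = 0, ..., length(p)
}

# Number of records up to T = 3 under the null: probabilities 1, 1/2, 1/3
pmf_N <- poisbinom_pmf(1 / seq_len(3))
names(pmf_N) <- 0:3
print(pmf_N)
```

Since the first observation is always a record, the probability of zero records is 0.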
Example datasets:
There are three example datasets included with this package. It is possible
to load these datasets into R using the data function. The
datasets have their own help file, which can be accessed by
help([dataset_name]).
Data included with RecordTest are:
TX_Zaragoza - Daily maximum temperatures at Zaragoza (Spain).
ZaragozaSeries - Split and uncorrelated subseries of TX_Zaragoza$TX.
Olympic_records_200m - 200-meter Olympic records from 1900 to 2020.
To see how to cite RecordTest in publications or elsewhere,
use citation("RecordTest").
Author(s)
Jorge Castillo-Mateo <jorgecastillomateo@gmail.com>, AC Cebrián, J Asín
References
Arnold BC, Balakrishnan N, Nagaraja HN (1998). Records. Wiley Series in Probability and Statistics. Wiley, New York. doi:10.1002/9781118150412.
Castillo-Mateo J (2022). “Distribution-Free Changepoint Detection Tests Based on the Breaking of Records.” Environmental and Ecological Statistics, 29(3), 655-676. doi:10.1007/s10651-022-00539-2.
Castillo-Mateo J, Cebrián AC, Asín J (2023a). “RecordTest: An R Package to Analyze Non-Stationarity in the Extremes Based on Record-Breaking Events.” Journal of Statistical Software, 106(5), 1-28. doi:10.18637/jss.v106.i05.
Castillo-Mateo J, Cebrián AC, Asín J (2023b). “Statistical Analysis of Extreme and Record-Breaking Daily Maximum Temperatures in Peninsular Spain during 1960–2021.” Atmospheric Research, 293, 106934. doi:10.1016/j.atmosres.2023.106934.
Cebrián AC, Castillo-Mateo J, Asín J (2022). “Record Tests to Detect Non Stationarity in the Tails with an Application to Climate Change.” Stochastic Environmental Research and Risk Assessment, 36(2), 313-330. doi:10.1007/s00477-021-02122-w.
Diersen J, Trenkler G (1996). “Records Tests for Trend in Location.” Statistics, 28(1), 1-12. doi:10.1080/02331889708802543.
Diersen J, Trenkler G (2001). “Weighted Records Tests for Splitted Series of Observations.” In J Kunert, G Trenkler (eds.), Mathematical Statistics with Applications in Biometry: Festschrift in Honour of Prof. Dr. Siegfried Schach, pp. 163–178. Lohmar: Josef Eul Verlag.
Foster FG, Stuart A (1954). “Distribution-Free Tests in Time-Series Based on the Breaking of Records.” Journal of the Royal Statistical Society B, 16(1), 1-22. doi:10.1111/j.2517-6161.1954.tb00143.x.