
RecordTest-package {RecordTest}

R Documentation

RecordTest: A Package for Testing the Classical Record Model

Description

RecordTest provides data preparation, exploratory data analysis and inference tools based on the theory of records to describe record occurrence and to detect trends, change-points or non-stationarities in the tails of time series. Details about the implemented tools can be found in Castillo-Mateo, Cebrián and Asín (2023a, 2023b).

Details

The Classical Record Model:

Record statistics are used primarily to quantify the stochastic behaviour of a process at never-seen-before values, either upper or lower. The setup of independent and identically distributed (IID) continuous random variables (RVs), often called the classical record model, is particularly interesting because the common continuous distribution underlying the RVs does not affect the distribution of the variables related to record occurrence.

Many fields have begun to use the theory of records to study these remarkable events. Particularly productive is the study of record-breaking temperatures and their connection with climate change, but records have also been analysed in other environmental fields (precipitation, floods, earthquakes, etc.), economics, biology, physics and even sports.

See Arnold, Balakrishnan and Nagaraja (1998) for an extensive theoretical introduction to the theory of records and, in particular, the classical record model. See Foster and Stuart (1954), Diersen and Trenkler (1996, 2001) and Cebrián, Castillo-Mateo and Asín (2022) for the distribution-free trend detection tests, and Castillo-Mateo (2022) for the distribution-free change-point detection tests based on the classical record model. See Castillo-Mateo, Cebrián and Asín (2023b) for the permutation versions of the tests. For an easy introduction to RecordTest use vignette("RecordTest"), and see Castillo-Mateo, Cebrián and Asín (2023a).

This package provides tests of the hypothesis of the classical record model, that is, that the records observed in a series of values measured at regular time units come from an IID series of continuous RVs. For sequences of independent variables with no seasonal component, testing the IID hypothesis is equivalent to testing homogeneity and stationarity.
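The distribution-free property can be checked with a small simulation. The following is a minimal sketch in base R, not code from the package: the helper names (is_record, record_prob) and the simulation sizes are assumptions chosen for illustration. It estimates the probability of an upper record at each time t for IID normal and IID exponential series and compares both with 1 / t.

```r
# Sketch: under IID continuous variables, P(record at time t) = 1 / t,
# whatever the underlying distribution. All names here are illustrative.
set.seed(1)
T_len <- 20     # length of each series
M <- 20000      # number of simulated series

# Upper-record indicators: t = 1 is always a record; afterwards a record
# occurs exactly when the running maximum increases.
is_record <- function(x) c(1L, as.integer(diff(cummax(x)) > 0))

# Proportion of series with a record at each time t, for a given generator
record_prob <- function(rgen) {
  X <- matrix(rgen(T_len * M), nrow = T_len, ncol = M)  # rows = time, columns = series
  rowMeans(apply(X, 2, is_record))
}

p_norm <- record_prob(rnorm)
p_exp  <- record_prob(rexp)

round(cbind(t = 1:T_len, theory = 1 / (1:T_len),
            normal = p_norm, exponential = p_exp), 3)
```

Both estimated columns should stay close to the theoretical 1 / t column, up to simulation noise.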

The functions in the data preparation step:

The functions accept as an argument a vector X corresponding to a single series. However, inference can benefit from having M uncorrelated series in the sample. In that case, the input can be a matrix X in which each column holds the values X_t, t = 1, ..., T, of one series, so that each row of the matrix corresponds to a time t.

In many real problems, such as those related to environmental phenomena, the series of variables to analyse show a seasonal behaviour, and only one realisation is available. In order to be able to apply the suggested tools to detect the existence of a trend, the seasonal component has to be removed and a sample of M uncorrelated series should be obtained. Those problems can be solved by preparing the data adequately. A wide set of tools to carry out a preliminary analysis and to prepare data with a seasonal pattern are implemented in the following functions. Note that the M series can be dependent if the p-values are approximated by permutations.

series_record: If only the record times are available.

series_split, series_double: To split the series into several subseries and remove the seasonal component and autocorrelation.

series_uncor: To extract a subset of uncorrelated subseries.

series_ties, series_untie: To deal with record ties.

series_rev: To study the series backwards.
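The idea behind the data preparation step can be sketched in base R. This is only an illustration under stated assumptions: the daily series is synthetic, years are taken to have exactly 365 days, and the thinning rule (keeping calendar days about one month apart) is a crude stand-in, not the procedure implemented by series_split or series_uncor.

```r
# Sketch: turn one seasonal daily series into a T x M matrix with one
# column per calendar day, then keep a subset of well-separated columns.
set.seed(42)
years <- 30
days  <- 365
day_of_year <- rep(1:days, times = years)

# Synthetic daily series: seasonal cycle plus noise (illustrative data only)
x <- 20 + 10 * sin(2 * pi * day_of_year / days) + rnorm(years * days)

# One row per year (time t), one column per calendar day (series m).
# Within a column the seasonal component is constant, so it no longer
# affects the comparison of values across years.
X <- matrix(x, ncol = days, byrow = TRUE)
dim(X)

# Crude thinning to reduce dependence between columns:
# keep calendar days about one month apart.
X_sub <- X[, seq(1, days, by = 30)]
dim(X_sub)
```

Each column of X_sub is then a series of length T = 30 that could be passed, column by column or as a whole matrix, to the record statistics described in this page.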

The functions to compute the record statistics are:

I.record: Computes the observed record indicators. NA values are taken as no records unless they appear at t = 1.

N.record, Nmean.record: Compute the observed number of records up to time t.

S.record: Computes the observed number of records at every time t, using M series.

p.record: Computes the estimated record probability at every time t, using M series.

L.record: Computes the observed record times.

R.record: Computes the observed record values.
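To make the definitions concrete, the quantities above can be computed by hand in base R. This is a sketch, not the package implementation: the object names are hypothetical, only upper records are considered, and the data are assumed to contain no NA values or ties.

```r
# Sketch: record statistics for a T x M matrix of simulated IID series
set.seed(7)
T_len <- 50
M <- 4
X <- matrix(rnorm(T_len * M), nrow = T_len)   # rows = time, columns = series

# Upper-record indicators of one series (t = 1 is always a record)
is_record <- function(x) c(1L, as.integer(diff(cummax(x)) > 0))

I <- apply(X, 2, is_record)   # record indicators, T x M
N <- apply(I, 2, cumsum)      # number of records up to time t, per series
S <- rowSums(I)               # number of records at time t over the M series
p_hat <- S / M                # estimated record probability at time t
L1 <- which(I[, 1] == 1)      # record times of the first series
R1 <- X[L1, 1]                # record values of the first series

N[T_len, ]                    # total number of records in each series
```

By construction the record values R1 are strictly increasing, and the last entry of N for a series equals the number of its record times.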

The functions to compute the tests:

All the tests performed are distribution-free/non-parametric tests in time series for trend, change-point and non-stationarity in the extremes of the distribution based on the null hypothesis that the record indicators are independent and the probabilities of record at time t are p_t = 1 / t.

change.point: Implements Castillo-Mateo change-point tests.

foster.test: Implements Foster-Stuart and Diersen-Trenkler trend tests.

N.test: Implements tests based on the (weighted) number of records.

brown.method: Brown's method to combine dependent p-values from N.test.

fisher.method: General function to apply Fisher's method to independent p-values.

p.regression.test: Implements a regression test based on the record probabilities.

p.chisq.test: Implements a chi-squared test based on the record probabilities.

lr.test: Implements likelihood ratio tests based on the record indicators.

score.test: Implements score or Lagrange multiplier tests based on the record indicators.
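The null hypothesis shared by these tests (independent record indicators with p_t = 1 / t) fixes the mean and variance of the number of records, which is enough to build a simple test. The sketch below is a base R illustration using a normal approximation. It is not the implementation of N.test or of any other function in the package; the function name n_records_test is hypothetical, and the variance formula assumes the M series are independent.

```r
# Sketch: one-sided test for "too many upper records" across M series
set.seed(123)
is_record <- function(x) c(1L, as.integer(diff(cummax(x)) > 0))

n_records_test <- function(X) {
  t_idx <- seq_len(nrow(X))
  M <- ncol(X)
  N_obs <- sum(apply(X, 2, is_record))              # observed total number of records
  mu <- M * sum(1 / t_idx)                          # null mean: M * sum of p_t
  sigma2 <- M * sum((1 / t_idx) * (1 - 1 / t_idx))  # null variance: M * sum of p_t (1 - p_t)
  z <- (N_obs - mu) / sqrt(sigma2)
  c(statistic = z, p.value = pnorm(z, lower.tail = FALSE))
}

T_len <- 50
M <- 40
X_null  <- matrix(rnorm(T_len * M), nrow = T_len)   # IID series
X_trend <- X_null + 0.1 * seq_len(T_len)            # same noise plus a linear trend in every column

res_null  <- n_records_test(X_null)
res_trend <- n_records_test(X_trend)
res_null
res_trend
```

Adding an increasing trend can only create upper records, never remove them, so the statistic for X_trend is at least as large as the one for X_null.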

The functions to compute the graphical tools:

records: Shows the series highlighting its records.

L.plot: Shows record times in several series.

foster.plot: Shows plots based on Foster-Stuart and Diersen-Trenkler statistics.

N.plot: Shows the (weighted) number of records.

p.plot: Shows the record probabilities in different plots.

All the tests and graphical tools can be applied to both upper and lower records in the forward and backward directions.

Other functions:

rcrm: Random generation for the classical record model.

dpoisbinom, ppoisbinom, qpoisbinom, rpoisbinom: Density, distribution function, quantile function and random generation for the Poisson binomial distribution. This is the distribution of the number of records under the null hypothesis.
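The link between the Poisson binomial distribution and the number of records can be illustrated without the package. Under the null hypothesis the number of records up to time T is a sum of independent Bernoulli variables with probabilities 1 / t, so its probability mass function can be obtained by successive convolution. The following base R sketch does this for T = 10; it is an illustration only and is not how dpoisbinom is implemented.

```r
# Sketch: exact null distribution of the number of records up to time T
T_len <- 10
p <- 1 / (1:T_len)     # record probabilities under the classical record model

pmf <- 1               # distribution of an empty sum: P(N = 0) = 1
for (pt in p) {
  # add one Bernoulli(pt): either no record (shift nothing) or a record (shift by one)
  pmf <- c(pmf * (1 - pt), 0) + c(0, pmf * pt)
}
names(pmf) <- 0:T_len  # pmf[k + 1] = P(N_T = k)

round(pmf, 4)
sum(0:T_len * pmf)     # mean of N_T, equal to sum(1 / (1:T_len))
```

Because the first observation is always a record, P(N_T = 0) is zero, and P(N_T = 1) equals 1 / T, the probability that the first value is never exceeded.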

Example datasets:

There are three example datasets included with this package. It is possible to load these datasets into R using the data function. Each dataset has its own help file, which can be accessed by help([dataset_name]). Data included with RecordTest are:

TX_Zaragoza - Daily maximum temperatures at Zaragoza (Spain).

ZaragozaSeries - Split and uncorrelated subseries of TX_Zaragoza$TX.

Olympic_records_200m - 200-meter Olympic records from 1900 to 2020.
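As a short usage sketch, a dataset can be loaded and inspected as follows. The guard with requireNamespace is an addition for illustration, so that the snippet does nothing when RecordTest is not installed; the variable name has_pkg is hypothetical.

```r
# Sketch: load one of the example datasets if the package is available
has_pkg <- requireNamespace("RecordTest", quietly = TRUE)

if (has_pkg) {
  data("TX_Zaragoza", package = "RecordTest")  # loads the object into the workspace
  str(TX_Zaragoza)                             # quick look at its structure
}
```

The same pattern applies to ZaragozaSeries and Olympic_records_200m.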

To see how to cite RecordTest in publications or elsewhere, use citation("RecordTest").

Author(s)

Jorge Castillo-Mateo <jorgecastillomateo@gmail.com>, AC Cebrián, J Asín

References

Arnold BC, Balakrishnan N, Nagaraja HN (1998). Records. Wiley Series in Probability and Statistics. Wiley, New York. doi:10.1002/9781118150412.

Castillo-Mateo J (2022). “Distribution-Free Changepoint Detection Tests Based on the Breaking of Records.” Environmental and Ecological Statistics, 29(3), 655-676. doi:10.1007/s10651-022-00539-2.

Castillo-Mateo J, Cebrián AC, Asín J (2023a). “RecordTest: An R Package to Analyze Non-Stationarity in the Extremes Based on Record-Breaking Events.” Journal of Statistical Software, 106(5), 1-28. doi:10.18637/jss.v106.i05.

Castillo-Mateo J, Cebrián AC, Asín J (2023b). “Statistical Analysis of Extreme and Record-Breaking Daily Maximum Temperatures in Peninsular Spain during 1960–2021.” Atmospheric Research, 293, 106934. doi:10.1016/j.atmosres.2023.106934.

Cebrián AC, Castillo-Mateo J, Asín J (2022). “Record Tests to Detect Non Stationarity in the Tails with an Application to Climate Change.” Stochastic Environmental Research and Risk Assessment, 36(2), 313-330. doi:10.1007/s00477-021-02122-w.

Diersen J, Trenkler G (1996). “Records Tests for Trend in Location.” Statistics, 28(1), 1-12. doi:10.1080/02331889708802543.

Diersen J, Trenkler G (2001). “Weighted Records Tests for Splitted Series of Observations.” In J Kunert, G Trenkler (eds.), Mathematical Statistics with Applications in Biometry: Festschrift in Honour of Prof. Dr. Siegfried Schach, pp. 163–178. Lohmar: Josef Eul Verlag.

Foster FG, Stuart A (1954). “Distribution-Free Tests in Time-Series Based on the Breaking of Records.” Journal of the Royal Statistical Society B, 16(1), 1-22. doi:10.1111/j.2517-6161.1954.tb00143.x.

[Package RecordTest version 2.2.0 Index]