R: Probabilities of Record Regression Test

p.regression.test {RecordTest}

R Documentation

Probabilities of Record Regression Test

Description

This function performs a linear hypothesis test, based on a regression on the record probabilities p_t, to test the hypothesis of the classical record model (i.e., a sequence of IID continuous random variables).

Usage

p.regression.test(
  X,
  record = c("upper", "lower"),
  formula = y ~ x,
  simulate.p.value = FALSE,
  B = 1000
)

Arguments

`X`	A numeric vector, matrix (or data frame).
`record`	A character string indicating the type of records to be calculated, "upper" or "lower".
`formula`	A `formula` passed to the `lm` function, e.g., `y ~ x`, `y ~ poly(x, 2, raw = TRUE)`, `y ~ log(x)`. By default `formula = y ~ x`. See Note for a caveat.
`simulate.p.value`	Logical. Indicates whether to compute p-values by Monte Carlo simulation. This is recommended when the number of columns of `X` (i.e., the number of series) is less than or equal to 10, since with so few series the test does not attain its nominal size.
`B`	If `simulate.p.value = TRUE`, an integer specifying the number of replicates used in the Monte Carlo estimation.

Details

The null hypothesis is that the observations in each series are independent and identically distributed realisations of a continuous random variable. This implies that, in every series (column of the matrix X), the probability of a record at time t is p_t = 1/t. Its sample estimate \hat p_t (see p.record) therefore satisfies

t \, \textrm{E}(\hat p_t) = 1.

Then,

H_0:\,p_t = 1/t, \, t=2, ..., T \iff H_0:\,\beta_0 = 1, \, \beta_1 = 0,

where \beta_0 and \beta_1 are the coefficients of the regression model

t \, \textrm{E}(\hat p_t) = \beta_0 + \beta_1 t.

The model has to be estimated by weighted least squares since the response is heteroskedastic.
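The heteroskedasticity can be made explicit under the null hypothesis. The following is a short derivation, assuming M mutually independent series (M and the indicator I_{t,m}, equal to 1 when series m has a record at time t, are notation introduced here for illustration):

```latex
\hat p_t = \frac{1}{M} \sum_{m=1}^{M} I_{t,m}, \qquad I_{t,m} \sim \textrm{Bernoulli}(1/t) \text{ under } H_0,

\textrm{Var}(t \, \hat p_t) = t^2 \, \frac{(1/t)(1 - 1/t)}{M} = \frac{t - 1}{M}.
```

The variance of the response therefore grows linearly with t, which suggests weights proportional to 1/(t - 1). Whether the function uses exactly these weights is not stated on this page.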

Other models can be considered with the formula argument. However, for the test to be correct, the model must leave the intercept free or fix it to 1 (see Examples for possible models).
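As an illustration of what fixing the intercept means, the sketch below uses base R only, with synthetic data, to show how `lm` treats an `offset` term like the one in the Examples. The data and the object names (`fit_fixed`, `pred`) are hypothetical and are not taken from the package.

```r
set.seed(7)

# Synthetic response that equals 1 at x = 1 and drifts linearly afterwards
x <- 2:30
y <- 1 + 0.05 * (x - 1) + rnorm(length(x), sd = 0.1)

# "- 1" removes the free intercept; the offset adds a constant 1 that is
# not estimated, so the only fitted coefficient is the slope on (x - 1)
fit_fixed <- lm(y ~ I(x - 1) - 1 + offset(rep(1, length(x))))
coef(fit_fixed)

# The fitted line is pinned to the value 1 where the regressor (x - 1) is zero
pred <- predict(fit_fixed, newdata = data.frame(x = 1))
pred
```

With this parameterisation the null hypothesis reduces to a single restriction on the slope, since the constant term is already set to 1.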

The F statistic compares the restricted model under the null hypothesis with the more general model (e.g., the alternative hypothesis where t \, \textrm{E}(\hat p_t) is a linear function of time t). This alternative hypothesis may be reasonable in many real examples, but not always.
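The comparison can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper `f_test_sketch`, the simulated data, and the weights 1/(t - 1) (which follow from Var(t \hat p_t) = (t - 1)/M for M independent series under the null) are all assumptions made here.

```r
set.seed(123)
T_len <- 40   # series length (illustrative)
M <- 25       # number of series (illustrative)
X <- matrix(rnorm(T_len * M), nrow = T_len)

# Hypothetical helper: classical F test of H0: beta_0 = 1, beta_1 = 0
f_test_sketch <- function(X) {
  n_time <- nrow(X)
  # Upper-record indicators: an observation is a record if it equals
  # the running maximum of its own series (column)
  I_rec <- apply(X, 2, function(v) as.numeric(v == cummax(v)))
  x <- 2:n_time
  y <- x * rowMeans(I_rec)[x]          # response t * p_hat_t
  w <- 1 / (x - 1)                     # assumed weights
  fit <- lm(y ~ x, weights = w)        # general model
  rss1 <- sum(w * residuals(fit)^2)    # weighted RSS, general model
  rss0 <- sum(w * (y - 1)^2)           # weighted RSS, restricted model y = 1
  df1 <- 2
  df2 <- length(y) - 2
  F_stat <- ((rss0 - rss1) / df1) / (rss1 / df2)
  c(F = F_stat, df1 = df1, df2 = df2,
    p.value = pf(F_stat, df1, df2, lower.tail = FALSE))
}

res <- f_test_sketch(X)
res
```

The restricted model has no free coefficients, so the numerator has 2 degrees of freedom, one for each constrained coefficient.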

If the sample size (i.e., the number of series or columns of X) is small, roughly below 8 to 12, the simulate.p.value option is recommended.
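A Monte Carlo p-value of this kind can be sketched in base R as below. Because the occurrence of records is distribution-free for IID continuous data, null replicates can be drawn from any continuous distribution; standard normals are used here. The helper `stat_sketch`, the weights, and the "add one" p-value formula are assumptions for illustration and may differ from what the function does internally.

```r
set.seed(2024)

# Hypothetical helper: weighted F-type statistic for H0: beta_0 = 1, beta_1 = 0
stat_sketch <- function(X) {
  X <- as.matrix(X)
  x <- 2:nrow(X)
  I_rec <- apply(X, 2, function(v) as.numeric(v == cummax(v)))
  y <- x * rowMeans(I_rec)[x]
  w <- 1 / (x - 1)                       # assumed weights
  fit <- lm(y ~ x, weights = w)
  rss1 <- sum(w * residuals(fit)^2)      # general model
  rss0 <- sum(w * (y - 1)^2)             # restricted model y = 1
  ((rss0 - rss1) / 2) / (rss1 / (length(y) - 2))
}

T_len <- 30   # series length (illustrative)
M <- 5        # few series, where the F approximation may be poor
B <- 200      # number of Monte Carlo replicates (illustrative)

X_obs <- matrix(rnorm(T_len * M), nrow = T_len)
stat_obs <- stat_sketch(X_obs)

# Statistic recomputed on B data sets simulated under the null hypothesis
stat_sim <- replicate(B, stat_sketch(matrix(rnorm(T_len * M), nrow = T_len)))

# Proportion of simulated statistics at least as large as the observed one
p_mc <- (1 + sum(stat_sim >= stat_obs)) / (B + 1)
p_mc
```

Larger values of B reduce the Monte Carlo error of the estimated p-value at the cost of more computation.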

Value

A "htest" object with elements:

`null.value`	Value of the coefficients under the null hypothesis when more than one coefficient is fitted.
`alternative`	Character string indicating the type of alternative hypothesis.
`method`	A character string indicating the type of test performed.
`estimate`	Value of the fitted coefficients.
`data.name`	A character string giving the name of the data.
`statistic`	Value of the `F` statistic.
`parameters`	Degrees of freedom of the `F` statistic.
`p.value`	P-value.

Note

IMPORTANT: In `formula`, the intercept must be left free or fixed to 1 for the test to be correct.

Author(s)

Jorge Castillo-Mateo

References

Castillo-Mateo J, Cebrián AC, Asín J (2023). “RecordTest: An R Package to Analyze Non-Stationarity in the Extremes Based on Record-Breaking Events.” Journal of Statistical Software, 106(5), 1-28. doi:10.18637/jss.v106.i05.

Examples

# Simple test for upper records (p-value = 0.01202)
p.regression.test(ZaragozaSeries)
# Simple test for lower records (p-value = 0.006175)
p.regression.test(ZaragozaSeries, record = "lower")

# Fit a model with only a quadratic term for upper records (p-value = 0.0003933)
p.regression.test(ZaragozaSeries, formula = y ~ I(x^2))
# Fit a model with only a quadratic term for lower records (p-value = 0.005108)
p.regression.test(ZaragozaSeries, record = "lower", formula = y ~ I(x^2))

# Fix the intercept to 1 for upper records (p-value = 0.01416)
p.regression.test(ZaragozaSeries, formula = y ~ I(x-1) - 1 + offset(rep(1, length(x))))
# Fix the intercept to 1 for lower records (p-value = 0.00138)
p.regression.test(ZaragozaSeries, record = "lower", 
  formula = y ~ I(x-1) - 1 + offset(rep(1, length(x))))

# Simulate p-value when the number of series is small
TxZ <- apply(series_split(TX_Zaragoza$TX), 1, max, na.rm = TRUE)
p.regression.test(TxZ, simulate.p.value = TRUE)

[Package RecordTest version 2.2.0 Index]