change.point {RecordTest} | R Documentation |
Change-point Detection Tests Based on Records
Description
Performs change-point detection tests based on the occurrence of records. The null hypothesis of the classical record model (i.e., IID continuous random variables) is tested against the alternative hypothesis that the series stops being IID after a certain time.
Usage
change.point(
X,
weights = function(t) 1,
record = c("upper", "lower", "d", "s"),
correct = c("none", "fisher", "vrbik"),
permutation.test = FALSE,
simulate.p.value = FALSE,
B = 1000
)
Arguments
X |
A numeric vector, matrix (or data frame). |
weights |
A function indicating the weight given to the different
records according to their position in the series. Castillo-Mateo (2022)
showed that the weights that get more power for this test are
|
record |
A character string that indicates the type of statistic used.
The statistic with |
correct |
A character string that indicates the continuity correction
in the Kolmogorov distribution made to the statistic: "fisher" (Fisher and
Robbins 2019), "vrbik" (Vrbik 2020) or "none" (the default) if no
correction is made. The former shows better size and power, but if the
value of the statistic is too large it becomes |
permutation.test |
Logical. Indicates whether to compute p-values by
permutation simulation (Castillo-Mateo et al. 2023). It does not require
that the columns of |
simulate.p.value |
Logical. Indicates whether to compute p-values by
Monte Carlo simulation. If |
B |
If |
Details
The test is implemented as given by Castillo-Mateo (2022). The null hypothesis is that
H_0: p_t = 1/t, \qquad t=1,\ldots,T,
where p_t is the probability of an (upper and/or lower) record at time t. The two-sided alternative hypothesis is that
H_1: p_t = 1/t, \quad t=1,\ldots,t_0, \qquad p_t \neq 1/t, \quad t=t_0+1,\ldots,T,
for a change-point t_0.
The test statistic is
K^{\omega}_T = \max_{1\le t \le T} \left| \frac{N_{t}^{\omega} - \textrm{E}(N_{t}^{\omega})}{\sqrt{\textrm{VAR}(N_{T}^{\omega})}} - \frac{\textrm{VAR}(N_{t}^{\omega})}{\textrm{VAR}(N_{T}^{\omega})} \frac{N_{T}^{\omega} - \textrm{E}(N_{T}^{\omega})}{\sqrt{\textrm{VAR}(N_{T}^{\omega})}} \right|,
where N_{t}^\omega = \sum_{m=1}^M \sum_{j=1}^t \omega_j I_{jm}, and the estimated change-point \hat{t}_0 is the value of t at which the maximum in K^{\omega}_T is attained.
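The statistic above can be sketched in base R for upper records. This is a minimal illustration, not the package's internal code: the function name `record_cp_statistic` is hypothetical, and the moments assume the classical record model with independent columns, so that the record indicators are independent Bernoulli variables with probability 1/t, giving E(N_t^\omega) = M \sum_{j \le t} \omega_j / j and VAR(N_t^\omega) = M \sum_{j \le t} \omega_j^2 (1/j)(1 - 1/j).

```r
# Hypothetical helper: computes K_T^omega and the estimated change-point
# for upper records, assuming IID series in independent columns and T > 1.
record_cp_statistic <- function(X, weights = function(t) 1) {
  X <- as.matrix(X)                 # T rows (time) by M columns (series)
  n_time <- nrow(X)
  n_series <- ncol(X)
  t_idx <- seq_len(n_time)
  w <- vapply(t_idx, function(t) as.numeric(weights(t)), numeric(1))

  # I[t, m] is TRUE if X[t, m] strictly exceeds all earlier values in column m
  is_upper <- function(x) c(TRUE, x[-1] > cummax(x)[-length(x)])
  I <- matrix(apply(X, 2, is_upper), nrow = n_time)

  N <- cumsum(w * rowSums(I))                 # weighted number of records N_t
  p <- 1 / t_idx                              # record probability under H_0
  E <- n_series * cumsum(w * p)               # E(N_t) under H_0
  V <- n_series * cumsum(w^2 * p * (1 - p))   # VAR(N_t) under H_0

  z_T <- (N[n_time] - E[n_time]) / sqrt(V[n_time])
  path <- abs((N - E) / sqrt(V[n_time]) - (V / V[n_time]) * z_T)
  list(statistic = max(path), estimate = which.max(path), path = path)
}

set.seed(1)
res <- record_cp_statistic(matrix(rnorm(200), nrow = 50, ncol = 4))
res$statistic   # value of K_T for unit weights
res$estimate    # time index where the maximum is attained
```

By construction the expression inside the maximum is zero at t = 1 and at t = T, so the estimated change-point always lies strictly inside the observation period when the statistic is positive.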
Argument record indicates if the I_{tm}'s are the "upper" or "lower" record indicators (see I.record). If record = "d" or record = "s", N_{t}^\omega is substituted in the expressions above by
d_{t}^{\omega,(F)} = N_{t}^{\omega,(FU)} - N_{t}^{\omega,(FL)}
or
s_{t}^{\omega,(F)} = N_{t}^{\omega,(FU)} + N_{t}^{\omega,(FL)},
respectively.
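As a small illustration of the d and s quantities, the following base-R sketch counts records for unit weights. It reads the superscripts (FU) and (FL) as forward upper and forward lower records, which is an assumption here, and the helper name `record_counts` is hypothetical. It only builds the counts; it does not compute their moments or the test statistic.

```r
# Hypothetical helper: cumulative upper/lower record counts and their
# difference (d) and sum (s), with all weights equal to 1.
record_counts <- function(X) {
  X <- as.matrix(X)                 # T rows (time) by M columns (series)
  n_time <- nrow(X)
  is_upper <- function(x) c(TRUE, x[-1] > cummax(x)[-length(x)])
  is_lower <- function(x) c(TRUE, x[-1] < cummin(x)[-length(x)])
  IU <- matrix(apply(X, 2, is_upper), nrow = n_time)
  IL <- matrix(apply(X, 2, is_lower), nrow = n_time)
  N_FU <- cumsum(rowSums(IU))       # upper records up to time t
  N_FL <- cumsum(rowSums(IL))       # lower records up to time t
  data.frame(t = seq_len(n_time), N_FU, N_FL,
             d = N_FU - N_FL, s = N_FU + N_FL)
}

record_counts(c(3, 5, 1, 4, 6, 0))
```

The first observation is both an upper and a lower record, so d starts at 0 and s starts at twice the number of series.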
The p-value is calculated by means of the asymptotic Kolmogorov distribution. When \omega_t \neq 1, the asymptotic result does not hold. In that case, the p-value should be simulated using permutations or Monte Carlo simulation with the option permutation.test = TRUE or simulate.p.value = TRUE, respectively. The permutation test is the only method of calculating p-values that does not require the columns of X to be independent.
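For orientation, an asymptotic p-value of this kind can be sketched with the standard series for the Kolmogorov distribution, P(K > k) = 2 \sum_{j \ge 1} (-1)^{j-1} \exp(-2 j^2 k^2). The function name `kolmogorov_pvalue` and the truncation at 1000 terms are assumptions made for illustration; the package may evaluate the distribution differently.

```r
# Hypothetical helper: upper-tail probability of the Kolmogorov distribution,
# using a truncated alternating series.
kolmogorov_pvalue <- function(k, terms = 1000) {
  if (k <= 0) return(1)             # the series is not usable at k = 0
  j <- seq_len(terms)
  p <- 2 * sum((-1)^(j - 1) * exp(-2 * j^2 * k^2))
  min(max(p, 0), 1)                 # guard against rounding outside [0, 1]
}

kolmogorov_pvalue(1.36)   # close to the familiar 5% level
```

This applies only to unit weights; for other weights the text above recommends simulating the p-value instead.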
Because the Kolmogorov distribution is an asymptotic result, the size and power of the test may fall slightly below what is expected. To correct this, either of the following continuity corrections can be used:
If correct = "fisher",
K_T = - \sqrt{T} \log\left(1 - \frac{K_T}{\sqrt{T}}\right).
If correct = "vrbik",
K_T = K_T + \frac{1}{6\sqrt{T}} + \frac{K_T - 1}{4T}.
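The two corrections can be written as a short base-R sketch that maps an uncorrected statistic to its corrected value. The function name `correct_statistic` and its argument names are hypothetical; only the two formulas come from the text above.

```r
# Hypothetical helper: applies the continuity corrections to a statistic k
# computed from a series of length n_time.
correct_statistic <- function(k, n_time,
                              correct = c("none", "fisher", "vrbik")) {
  correct <- match.arg(correct)
  switch(correct,
    none   = k,
    fisher = -sqrt(n_time) * log(1 - k / sqrt(n_time)),
    vrbik  = k + 1 / (6 * sqrt(n_time)) + (k - 1) / (4 * n_time)
  )
}

correct_statistic(1, n_time = 100, correct = "fisher")
correct_statistic(1, n_time = 100, correct = "vrbik")
```

Both corrections increase a statistic of this size slightly, which lowers the asymptotic p-value. The logarithm in the Fisher form is only defined when K_T < \sqrt{T}.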
Value
A "htest" object with elements:
statistic |
Value of the test statistic. |
p.value |
P-value. |
alternative |
The alternative hypothesis. |
estimate |
The estimated change-point time. |
method |
A character string indicating the type of test performed. |
data.name |
A character string giving the name of the data. |
Author(s)
Jorge Castillo-Mateo
References
Castillo-Mateo J (2022). “Distribution-Free Changepoint Detection Tests Based on the Breaking of Records.” Environmental and Ecological Statistics, 29(3), 655-676. doi:10.1007/s10651-022-00539-2.
Castillo-Mateo J, Cebrián AC, Asín J (2023). “Statistical Analysis of Extreme and Record-Breaking Daily Maximum Temperatures in Peninsular Spain during 1960–2021.” Atmospheric Research, 293, 106934. doi:10.1016/j.atmosres.2023.106934.
Fisher TJ, Robbins MW (2019). “A Cheap Trick to Improve the Power of a Conservative Hypothesis Test.” The American Statistician, 73(3), 232-242. doi:10.1080/00031305.2017.1395364.
Vrbik J (2020). “Deriving CDF of Kolmogorov-Smirnov Test Statistic.” Applied Mathematics, 11(3), 227-246. doi:10.4236/am.2020.113018.
Examples
change.point(ZaragozaSeries)
change.point(series_split(TX_Zaragoza$TX), record = "d",
weights = function(t) sqrt(t),
permutation.test = TRUE, B = 50)
change.point(ZaragozaSeries, record = "d",
weights = function(t) sqrt(t), simulate.p.value = TRUE)
test.result <- change.point(rowMeans(ZaragozaSeries))
test.result
## Not run: Load package ggplot2 to plot the changepoint
#library("ggplot2")
#records(rowMeans(ZaragozaSeries)) +
# ggplot2::geom_vline(xintercept = test.result$estimate, colour = "red")