bentcable.ar {bentcableAR} | R Documentation |
Bent-Cable Regression for Independent and Autoregressive Data
Description
These two functions are the main interfaces in the
bentcableAR
package. They perform bent-cable (including
broken-stick) regression to AR(p) time-series data or independent
data (time-series or otherwise) and produce diagnostic plots.
Confidence intervals for the critical time point (CTP) are
included in some cases.
Usage
bentcable.ar(y.vect, tgdev = NULL, p = 0, stick = FALSE, t.vect = NULL,
init.cable = NULL, init.phi = NULL, tol = 1e-04,
method0 = "css", method1 = "yw", ci.level = 0.95,
main = NULL)
bentcable.dev.plot(tau.vect, gamma.vect = NULL, y.vect, t.vect = NULL,
stick = FALSE, p = 0)
Arguments
y.vect |
A numeric vector of response data. |
t.vect |
A numeric vector of design points, which MUST
be equidistant with unit increments if p>0 is assumed. They need
not be equidistant for independent data. Specifying
|
tau.vect , gamma.vect |
Numeric vectors specifying a
|
tgdev |
A |
p |
The autoregressive order (non-negative integer).
|
stick |
A logical value; if |
init.cable |
A numeric vector of initial values for the
bent-cable parameters. If |
init.phi |
A numeric vector of initial values for the AR
coefficients. If not provided, then a default value is assigned,
consisting of the first |
tol |
Tolerance for determining convergence. |
method0 , method1 |
The fitting method when p>0. |
ci.level |
A numeric value between 0 and 1, exclusive. Used to
compute the CTP confidence interval when |
main |
A title for the set of diagnostic plots. |
Details
bentcable.dev.plot
involves bent-cable regression assuming a
known transition. It plots a profile deviance surface over a fixed
grid (see References). It also returns the grid and the
profile deviance surface matrix, which can be used to generate
initial values for an overall bent-cable regression (no known
parameters).
bentcable.ar
is used mainly for overall bent-cable
regression, with one exception. Different scenarios
determine the behaviour of bentcable.ar
, as follows.
(1) Independent data and tgdev
is supplied. In this case,
bentcable.ar
calls cable.ar.0.fit
which identifies
the best grid-based fit from tgdev
, then feeds it through an
internal engine cable.ar.p.iter
or stick.ar.0
that
performs overall bent-cable regression. This best fit is returned
but not plotted, and the autocorrelation is diagnosed (even for
non-time-series data) by a PACF plot and a suggested value of p
based on the AIC (see ar
). As stated in the screen
output, these diagnostics should be used only for time-series data,
where the returned best AR(0) estimates are intended to be supplied
as init.cable
in a subsequent call of bentcable.ar
for an AR(p>0) fit. To produce a plot of the returned
best AR(0) fit and/or the corresponding CTP confidence interval,
the user can supply the returned parameter estimates as
init.cable
in another call of bentcable.ar
with
p=0
(see Scenario (3)).
(2) AR(p>0) data and tgdev
is supplied. In this case, no
graphics are produced; bentcable.ar
simply locates the
highest point on the grid-based profile deviance surface and
returns the corresponding (crude) parameter estimates to be used as
init.cable
and init.phi
in subsequent overall
bent-cable fits. If multiple peaks exist (such as along a ridge),
then only that at the smallest \tau
and smallest \gamma
is used.
(3) Independent data (time series or otherwise) and
init.cable
are supplied. In this case, bentcable.ar
performs overall bent-cable regression and produces a
scatterplot of the data superimposed with the best fit and
estimated transition. For time series data where the CTP is
applicable (see Warnings), the CTP confidence interval is
additionally computed and superimposed in blue. No other plots are
produced. Since init.cable
is supposed to have come from a
reasonable source (such as grid-based), this fit is not intended to
be fed to another round of bentcable.ar
, except when the
user wishes to explore using a positive p (but this should be
performed in conjunction with another round of grid-based approach
in Scenario (2)).
(4) AR(p>0) data and init.cable
are supplied. In this
case, bentcacble.ar
computes the overall bent-cable fit and
CTP confidence interval (see cable.change.conf
). Also
included are the following diagnostics: a scatterplot of the data
superimposed with the best fit and estimated transition
(\tau-\gamma,\tau,\tau+\gamma)
(in red) and the CTP
confidence interval (in blue, if it exists - see Warnings),
and ACF and PACF plots for the fitted residuals and innovations
(see cable.ar.p.resid
for their difference). Since
init.cable
is supposed to have come from a reasonable
source (such as grid-based), this fit is not intended to be fed
to another round of bentcable.ar
, except when the user
wishes to explore using an alternative p (but this should be
performed in conjunction with another round of grid-based approach
in Scenario (1) or (2)), or when the "css"
algorithm fails
to converge but the SSE value is desired (see Details).
Below is a summary of the bent-cable regression methodology, and
how one may apply it by using the bentcableAR
package.
The bent cable is a linear-quadratic-linear function, where
the quadratic bend is regarded as the transition from the incoming
linear phase to the outgoing linear phase. A bent cable has the form
f(t) = b_0 + b_1 t + b_2 q(t)
, where q(t)
is the
basic bent cable with incoming slope 0 and outgoing slope 1,
and a quadratic bend that is centred at \tau
with half-width
\gamma\ge 0
:
q(t)=\frac{(t-\tau+\gamma)^2}{4\gamma} I\{|t-\tau|\le\gamma\}
+ (t-\tau) I\{t>\tau+\gamma\}.
The broken stick is a special bent cable with no quadratic
bend (i.e. \gamma
=0). The term bent-cable regression
implicitly includes broken-stick regression.
For independent data (time series or otherwise), bent-cable
regression by maximum likelihood is performed via nonlinear
least-squares estimation of \theta=(b_0,b_1,b_2,\tau,\gamma)
.
For AR(p) data, the AR coefficients are
\phi=(\phi_1,\phi_2,\ldots,\phi_p)
, and conditional maximum
likelihood (CML) estimation of (\theta,\phi)
(conditioned on
the first p data points) is performed by nonlinear conditional
least squares (i.e. minimizing the conditional sum-of-squares error
(SSE)). In this time-series context, time points are assumed to be
equidistant with unit increments.
Minimization of the (conditional) SSE is specified as "css"
by default for method0
. However, "css"
sometimes
fails to converge, or the resulting \phi
estimate sometimes
corresponds to non-stationarity. In this case, the alternative
estimation approach specified for method1
is attempted.
"mle"
specifies the CML-ML hybrid algorithm, and
"yw"
the CML-ML-MM hybrid algorithm (MM stands
for method of moments; see References.) Both
"yw"
and "mle"
guarantee stationarity, but often take
much longer than "css"
to converge.
Due to nonlinearity, initial values must be supplied for proper
parameter estimation. Also, bent-cable regression is a notoriously
irregular estimation problem (due to low-order differentiability),
and the estimation algorithms (mainly the built-in R
functions nls
and optim
) may fail to converge from
initial values that are unrefined guesses of the parameters. When
this happens, the user is advised to generate an initial value from
a grid-based procedure.
The grid-based procedure involves specifying a
(\tau,\gamma)
-grid over which the bent-cable profile deviance
surface is evaluated and plotted, such as by
bentcable.dev.plot
. At each grid
point, the transition is fixed, and bent-cable regression involves
only linear parameters b_0, b_1, b_2
and AR coefficients
\phi
, all of which can be estimated using standard
time-series algorithms (mainly the built-in R functions ar
and arima
). Regression at each grid point yields a point on
the profile deviance surface. The grid point at which the profile
deviance is maximum corresponds to a bent-cable fit (given a known
transition) that is best among the specified grid points. Thus, for
a high-resolution grid, this best grid point together with
the corresponding estimates of b_0, b_1, b_2
and \phi
may be regarded as the ML or CML estimate for the model. However,
high-resolution grid-based estimation may be computationally
infeasible. Instead, the best grid point on a coarser grid
can give good initial values for the true ML or CML estimate that
is trapped between grid points.
However, the true ML or CML estimate may not easily come by
even with good initial values. Irregularity of bent-cable
regression often manifests itself in the form of multiple peaks on
the deviance surface. Thus, the user should be aware of different
local maxima on which the optimization algorithm can converge
despite initial values for \theta
that are very similar. The
user is advised to combine several exploratory analyses as well as
model diagnoses before settling on a best fit.
For example, one may first fix p=0 as the AR order, then use
bentcable.dev.plot
to conduct a visual inspection of the
profile deviance surface over a fine (\tau,\gamma)
-grid. This
is to identify the neighbourhood of the global maximum for p=0. If
necessary, one can zoom in to this neighbourhood by placing
over it an even finer grid to hone the grid-based approximation.
The resulting bentcable.dev.plot
object can then be fed to
bentcable.ar
to produce a best overall fit for the AR(0)
assumption in that neighbourhood. If p=0 is deemed inadequate based
on the bentcable.ar
diagnostics, then the regression must
now be repeated for a newly chosen p. Since the bent-cable
parameter estimates will differ for different values of p, the
earlier AR(0) estimates may or may not be good initial values for
this new AR(p) fit. The user is advised to try several additional
initial values, possibly repeating the grid-based procedure, but
this time using the new p. To further screen out local maxima, the
SSE values for these AR(p) fits (with common p) should be compared.
For a "css"
fit, the SSE is stored in
$cable$ar.p.fit$value
of the returned object. The SSE is not
directly retrievable for a "yw"
or "mle"
fit, but the
user can apply the estimates returned in $cable$est
as the initial
values to a subsequent "yw"
fit, and the SSE will appear in
the screen output as initial value while the "css"
algorithm iterates.
As with any numerical optimization procedure, there is no guarantee that the fit observed to have the smallest SSE value indeed corresponds to the global maximum.
Value
cable |
An object that is compatible with a
|
ctp |
A |
fit |
Returned by |
init |
Returned by |
y , t , n , p , stick |
Returned by |
dev , tau , gamma |
Returned by |
Warnings
For time-series data, t.vect
MUST be
equidistant with unit increments; otherwise, these
functions will return meaningless values. (For
independent data, t.vect
can be non-equidistant.)
Computations for the CTP estimate and confidence interval are based on a time
vector of the form c(0,1,2,...)
. For any other form for the time
vector, the CTP will not be computed, and on-screen warnings
will appear. To ensure compatibility between the model fit and CTP
estimates, the user is advised to fit the model using the default
time vector. Then, if necessary, the user may transform the results
to the preferred time scale after the model and CTP estimates have
been produced.
The above computational issue implies that the CTP cannot
be computed for non-time-series data. Rationale:
In a non-time-series context design points are often
non-equidistant, and the cable's slope often never changes sign;
even with a sign change, the point at which this takes place may be
less interpretable. In such a context, the user is advised to rely
on confidence regions for (\tau,\gamma)
(see
References).
Note
The major engines for bentcable.dev.plot
are
cable.dev
and cable.fit.known.change
.
The computational engines for bentcable.ar
are
cable.ar.p.iter
, cable.ar.0.fit
,
stick.ar.0
, and cable.change.conf
,
while the plotting engine is cable.ar.p.diag
.
Although these and other lesser functions are called
internally by the two main interfaces described here, they can be
used as stand-alone functions, and the user is advised to
refer to their documentation. Type
library(help="bentcableAR")
for a full list of available
functions.
Author(s)
Grace Chiu
References
Chiu, G.S. and Lockhart, R.A. (2010), Bent-Cable Regression with Autoregressive Noise, Canadian Journal of Statistics, 38, 386–407. DOI: 10.1002/cjs.10070. URL: https://www.researchgate.net/publication/227652258_Bent-cable_regression_with_autoregressive_noise
Chiu, G., Lockhart, R. and Routledge, R. (2006), Bent-Cable Regression Theory and Applications, Journal of the American Statistical Association, 101, 542–553. DOI: 10.1198/016214505000001177. URL: https://www.researchgate.net/publication/4742466_Bent-Cable_Regression_Theory_and_Applications
See Also
cable.lines
, lm
, nls
, optim
,
ar
, arima
, plot
,
par
, contour
, persp
Examples
## Not run:
# Scenario (1)
##############
# independent non-time-series cable:
data(stagnant)
bentcable.dev.plot( seq(-1,1,length=20),
seq(.1,1,length=20), stagnant$loght, stagnant$logflow )
# zoom in to global max
dev0 <- bentcable.dev.plot( seq(-.04,.16,length=20),
seq(.2,.65,length=20), stagnant$loght, stagnant$logflow )
# locally smooth deviance surface
cable <- bentcable.ar( stagnant$loght, tgdev=dev0, t.vect=stagnant$logflow )
# ignore time-series diagnostics
# local regularity - expect to be true best fit
# SSE=0.005
# feed 'cable' in Scenario (3) to get fitted plot:
# bentcable.ar( cable$y, init.cable=coef(cable$fit),
# t.vect=cable$t )
# AR(0) stick, start time at 80:
dev0 <- bentcable.dev.plot( seq(85,97,length=15), 0,
sockeye$logReturns, sockeye$year, TRUE ) # obvious global max
stick0 <- bentcable.ar( sockeye$logReturns, tgdev=dev0, stick=TRUE,
t.vect=sockeye$year )
# local regularity - should be true best fit
# SSE=8.85
# diagnostics: take p=0 to 4 ??
# AR(0) cable, start at time 0:
bentcable.dev.plot( seq(1,20,length=25),
seq(.1,15,length=25), sockeye$logReturns )
# zoom in to global max
dev0 <- bentcable.dev.plot( seq(10,15,length=25),
seq(2,10,length=20), sockeye$logReturns )
# surface has ridge - expect some trouble locating true peak
cable0 <- bentcable.ar( sockeye$logReturns, tgdev=dev0 )
# apparent best AR(0) fit: SSE=8.68
# diagnostics: take p=2 to 6
# compare to this:
# dev1 <- bentcable.dev.plot( seq(10,15,length=25),
# seq(2,10,length=15), sockeye$logReturns )
# bentcable.ar( sockeye$logReturns, tgdev=dev1 ) # SSE=8.683
# # not an obvious local max!
# feed 'cable0' in Scenario (3) to get fitted plot:
# bentcable.ar( cable0$y, init.cable=coef(cable0$fit) )
## End(Not run)
# Scenario (2)
##############
data(sockeye)
# AR(2) cable, start time at 0:
bentcable.dev.plot( seq(6,18,length=15),
seq(.01,12,length=15), sockeye$logReturns, p=2 )
# zoom in to global max
dev2 <- bentcable.dev.plot( seq(10,12,length=15),
seq(1,5,length=15), sockeye$logReturns, p=2 )
# best grid-based fit
gr.cable2 <- bentcable.ar( sockeye$logReturns, tgdev=dev2, p=2 )
# to be used in Scenario (4)
# local regularity - expect little trouble
# AR(2) stick, start time at 80:
bentcable.dev.plot( seq(86,98,length=15), y.vect=sockeye$logReturns,
p=2, stick=TRUE, t.vect=sockeye$year )
# zoom in to global max
dev3 <- bentcable.dev.plot( seq(88.5,93,length=25),
y.vect=sockeye$logReturns,
p=2, stick=TRUE, t.vect=sockeye$year )
# camel hump - double peaks!
# best grid-based fit
gr.stick2 <- bentcable.ar( sockeye$logReturns, tgdev=dev3, p=2, stick=TRUE,
t.vect=sockeye$year )
# irregularity - expect some trouble if used in Scenario (4)
## Not run:
# AR(4) cable, start time at 0:
bentcable.dev.plot( seq(6,18,length=15), seq(.01,12,length=15),
sockeye$logReturns, p=4 )
# zoom in to global max
dev4 <- bentcable.dev.plot( seq(10,12,length=15),
seq(1,7,length=25), sockeye$logReturns, p=4 )
# slight ridge
# best grid-based fit
gr.cable4 <- bentcable.ar( sockeye$logReturns, tgdev=dev4, p=4 )
# to be used in Scenario (4)
# will ridge be problem???
# Scenario (3)
##############
# independent non-time-series cable:
data(stagnant)
bentcable.ar( stagnant$loght, t.vect=stagnant$logflow,
init.cable=c(.6,-.4,-.7,0,.5) ) # SSE=0.005
# identical to 'cable' in Scenario (1)
# no irregularity, no ambiguity!
# AR(0) stick, start time at 80:
bentcable.ar( sockeye$logReturns, init.cable=c(10,.1,-.5,90),
stick=TRUE, t.vect=sockeye$year )
# identical to 'stick0' in Scenario (1)
# local regularity, no trouble
# AR(0) stick, start time at 0:
bentcable.ar( sockeye$logReturns, init.cable=coef(cable0$fit)[1:5],
stick=TRUE )
# identical to 'cable0' in Scenario (1)
# here you get plot of fit and CTP confidence interval
## End(Not run)
# Scenario (4)
##############
# AR(2) cable, start time at 0:
# use 'gr.cable2' from Scenario (2)
cable2 <- bentcable.ar( sockeye$logReturns,
init.cable=gr.cable2$init[1:5], init.phi=gr.cable2$init[-c(1:5)] )
# "css" successful
# best AR(2) fit, SSE=4.868
# compare to this:
# bentcable.ar( sockeye$logReturns,
# init.cable=c(13,.1,-.5,11,4), p=2 )
# "css" successful, same SSE, virtually same fit
# recall local regularity from 'dev2'
# AR(2) stick, start time at 80:
# use 'gr.stick2' from Scenario (2)
stick2 <- bentcable.ar( sockeye$logReturns, init.cable=gr.stick2$init[1:4],
init.phi=gr.stick2$init[-c(1:4)], stick=TRUE, t.vect=sockeye$year )
# "css" successful, best AR(2) fit, SSE=5.0
# compare this to the other peak shown in 'dev3'
# bentcable.ar( sockeye$logReturns,
# init.cable=c(10,0,-.5,91.5), p=2, stick=TRUE,
# t.vect=sockeye$year )
# "css" successful, SSE=5.1, not best fit!
## Not run:
# AR(4) cable, start time at 0:
cable4 <- bentcable.ar( sockeye$logReturns,
init.cable=gr.cable4$init[1:5], init.phi=gr.cable4$init[-c(1:5)] )
# "css" unsuccessful, switched to "yw"
# feed 'cable4' in Scenario (4) to get SSE from screen output:
bentcable.ar( cable4$cable$y, init.cable=cable4$cable$est[1:5],
init.phi=cable4$cable$est[-c(1:5)] )
# SSE=2.47 from screen output
## End(Not run)