arfima {arfima} | R Documentation |
Fit ARFIMA, ARIMA-FGN, and ARIMA-PLA (multi-start) models
Description
Fits ARFIMA/ARIMA-FGN/ARIMA-PLA multi-start models to time series data. Options include fixing parameters, whether or not to fit fractional noise, and what type of fractional noise to use (fractional Gaussian noise (FGN), fractionally differenced white noise (FDWN), or the newly introduced power-law autocovariance noise (PLA)). This function can fit regressions with ARFIMA/ARIMA-FGN/ARIMA-PLA errors via the xreg argument, including dynamic regression (transfer functions).
Fits by direct optimization using optim. The optimizer choices (the whichopt argument) are: 0 - BFGS; 1 - Nelder-Mead; 2 - SANN; otherwise CG.
Usage
arfima(
z,
order = c(0, 0, 0),
numeach = c(1, 1),
dmean = TRUE,
whichopt = 0,
itmean = FALSE,
fixed = list(phi = NA, theta = NA, frac = NA, seasonal = list(phi = NA, theta = NA,
frac = NA), reg = NA),
lmodel = c("d", "g", "h", "n"),
seasonal = list(order = c(0, 0, 0), period = NA, lmodel = c("d", "g", "h", "n"),
numeach = c(1, 1)),
useC = 3,
cpus = 1,
rand = FALSE,
numrand = NULL,
seed = NA,
eps3 = 0.01,
xreg = NULL,
reglist = list(regpar = NA, minn = -10, maxx = 10, numeach = 1),
check = F,
autoweed = TRUE,
weedeps = 0.01,
adapt = TRUE,
weedtype = c("A", "P", "B"),
weedp = 2,
quiet = FALSE,
startfit = NULL,
back = FALSE
)
Arguments
z |
The data set (time series) |
order |
The order of the ARIMA model to be fit: c(p, d, q), where p is the number of AR parameters (phi), d is the amount of integer differencing, and q is the number of MA parameters (theta). Note that we use the Box-Jenkins convention for the MA parameters, in that they are the negative of the parameters given by arima: see "Details". |
numeach |
The number of starts to fit for each parameter. The first element of the vector is the number of starts for each AR/MA parameter, while the second is the number of starts for the fractional parameter. When the latter is set to 0, no fractional noise is fit. Note that the number of starts in total is multiplicative: if we are fitting an ARFIMA(2, d, 2), and use the older number of starts (c(2, 2)), we will have 2^2 * 2 * 2^2 = 32 starting values for the fits. Note that the default has changed from c(2, 2) to c(1, 1) since package version 1.4-0. |
dmean |
Whether the mean should be fit dynamically with the optimizer. Note that the likelihood surface will change if this is TRUE, but this is usually not worrisome. See the referenced thesis for details. |
whichopt |
Which optimizer to use in the optimization: see "Details". |
itmean |
This option is under investigation, and will be set to FALSE automatically until it has been decided what to do. Whether the mean should be fit iteratively using the function
|
fixed |
A list of parameters to be fixed. If we are to fix certain
elements of the AR process, for example, fixed$phi must have length equal to
p. Any numeric value will fix the parameter at that value; for example, if
we are modelling an AR(2) process, and we wish to fix only the first
autoregressive parameter to 0, we would have fixed = list(phi = c(0, NA)).
NA corresponds to that parameter being allowed to change in the optimization
process. We can fix the fractional parameters, and unlike
|
lmodel |
The long memory model (noise type) to be used: "d" for FDWN, "g" for FGN, "h" for PLA, and "n" for none (i.e. ARMA short memory models). Default is "d". |
seasonal |
The seasonal components of the model we wish to fit, with the same components as above. The period must be supplied. |
useC |
How much interfaced C code to use: an integer between 0 and 3. The value 3 is strongly recommended. See "Details". |
cpus |
The number of CPUs used to perform the multi-start fits. A small number of fits and a high number of cpus (say both equal 4) with n not large can actually be slower than when cpus = 1. The number of CPUs should not exceed the number of threads available to R. |
rand |
Whether random starts are used in the multi-start method. Defaults to FALSE. |
numrand |
The number of random starts to use. |
seed |
The seed for the random starts. |
eps3 |
How far to start from the boundaries when using a grid for the multi-starts (i.e. when rand is FALSE). |
xreg |
A matrix, data frame, or vector of regressors for regression or transfer functions. |
reglist |
A list with the following elements:
|
check |
If TRUE, checks at each optim iteration whether the model is identifiable. This makes the optimization much slower. |
autoweed |
Whether to automatically (before the fit is returned) weed out modes found that are close together (usually the same point). |
weedeps |
The maximum distance between modes that are close together for the mode with the lower log-likelihood to be weeded out. If adapt is TRUE (default) this value changes. |
adapt |
If TRUE, if dim is the dimensionality of the search, weedeps is
changed to |
weedtype |
The type of weeding to be done. See |
weedp |
The p in the p-norm to be used in the weeding. p = 2 (default) is Euclidean distance. |
quiet |
If TRUE, no auxiliary output is generated. The default (FALSE) prints information about the fits being performed. |
startfit |
Meant primarily for debugging (for now), allows starting places
for the fitting process. Overrides |
back |
Setting this to TRUE restores the older defaults in numeach (c(2, 2); see numeach above). |
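The multiplicative growth in starting values described under numeach can be sketched in a few lines of base R. The helper n_starts below is hypothetical (it is not part of the arfima package) and assumes a non-seasonal model in which every AR and MA parameter gets numeach[1] starts and the fractional parameter gets numeach[2] starts.

```r
# Hypothetical helper, not part of arfima: counts the grid starting values
# for a non-seasonal ARFIMA(p, d, q) fit under the assumptions above.
n_starts <- function(p, q, numeach = c(1, 1)) {
  arma_starts <- numeach[1]^(p + q)
  # numeach[2] = 0 means no fractional noise is fit, so the fractional
  # parameter contributes a factor of 1 rather than 0.
  frac_starts <- max(numeach[2], 1)
  arma_starts * frac_starts
}

n_starts(2, 2, numeach = c(2, 2))  # 2^2 * 2 * 2^2 = 32
n_starts(2, 2, numeach = c(1, 1))  # current default: 1
```

Seasonal components, which carry their own numeach, would multiply this count further.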
Details
A word of warning: it is generally better to use the default (BFGS), and to use Nelder-Mead only to check for spurious modes. SANN takes a long time (and may only find one mode), and CG may not be stable.
If using Nelder-Mead, it must be stressed that Nelder-Mead can take out non-spurious modes or add spurious modes: we have checked visually where we could. Therefore it is wise to use BFGS as the default and if there are modes close to the boundaries, check using Nelder-Mead.
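The advice above (fit with BFGS, then cross-check with Nelder-Mead) can be illustrated with base R's optim on a toy objective. This is only a sketch: opt_method is a hypothetical lookup that mirrors the documented whichopt codes, and the objective f is a made-up two-mode function, not an ARFIMA likelihood.

```r
# Hypothetical lookup mirroring the documented whichopt codes.
opt_method <- function(whichopt) {
  switch(as.character(whichopt),
         "0" = "BFGS",
         "1" = "Nelder-Mead",
         "2" = "SANN",
         "CG")  # any other value
}

# Toy objective with two minima, at (-1, 0) and (1, 0).
f <- function(x) (x[1]^2 - 1)^2 + x[2]^2

# Two starting points, one on each side of the ridge at x[1] = 0.
starts <- list(c(-0.5, 0.5), c(0.5, 0.5))

# Run optim from every start; returns one row of parameters per start.
run <- function(whichopt) {
  t(sapply(starts, function(s) optim(s, f, method = opt_method(whichopt))$par))
}

modes_bfgs <- run(0)  # default optimizer
modes_nm   <- run(1)  # cross-check with Nelder-Mead
```

Comparing modes_bfgs with modes_nm row by row shows whether both optimizers agree on the modes reached from each start.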
The moving average parameters are in the Box-Jenkins convention: they are the negative of the parameters given by arima. That is, the model to be fit is, in the case of a non-seasonal ARIMA model, phi(B) (1-B)^d z[t] = theta(B) a[t], where phi(B) = 1 - phi(1) B - ... - phi(p) B^p and theta(B) = 1 - theta(1) B - ... - theta(q) B^q.
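To make the sign convention concrete, the following base R sketch fits an ARMA(1, 1) with stats::arima and converts its MA coefficient to the Box-Jenkins convention described above. The simulated series and the variable names are illustrative assumptions only.

```r
set.seed(1)
# stats::arima.sim and stats::arima both write the MA polynomial
# as 1 + ma1 * B.
x <- arima.sim(model = list(ar = 0.5, ma = 0.4), n = 500)
fit_stats <- arima(x, order = c(1, 0, 1))

ma_stats <- coef(fit_stats)[["ma1"]]  # stats::arima convention
theta_bj <- -ma_stats                 # Box-Jenkins convention: 1 - theta * B

# Both coefficients describe the same MA polynomial at any value of B.
B <- 0.3
c(stats = 1 + ma_stats * B, box_jenkins = 1 - theta_bj * B)
```

The AR coefficients need no such conversion, since both conventions write phi(B) with minus signs.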
For the useC parameter, a "0" means no C is used; a "1" means C is only used to compute the log-likelihood, but not the theoretical autocovariance function (tacvf); a "2" means that C is used to compute the tacvf and not the log-likelihood; and a "3" means C is used to compute everything.
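The four useC values behave like two independent switches, which the small sketch below decodes. The helper use_c_parts is hypothetical and only restates the paragraph above; it does not inspect the package.

```r
# Hypothetical helper decoding the documented useC values.
use_c_parts <- function(useC) {
  stopifnot(useC %in% 0:3)
  useC <- as.integer(useC)
  c(loglik = bitwAnd(useC, 1L) > 0,  # 1 or 3: log-likelihood computed in C
    tacvf  = bitwAnd(useC, 2L) > 0)  # 2 or 3: tacvf computed in C
}

# One row per useC value from 0 to 3.
use_c_table <- t(sapply(0:3, use_c_parts))
rownames(use_c_table) <- paste0("useC=", 0:3)
use_c_table
```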
Value
An object of class "arfima". In it, full information on the fit is given, though not printed under the print.arfima method. The phis are the AR parameters, and the thetas are the MA parameters. Residuals, regression residuals, etc., are all available, along with the parameter values and standard errors. Note that the muHat returned in the arfima object is of the differenced series, if differencing is applied.
Note that if multiple modes are found, they are listed in order of log-likelihood value.
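Which modes survive to be listed depends on the weeding controlled by autoweed, weedeps and weedp. The base R sketch below shows one plausible reading of that rule: modes are visited in decreasing log-likelihood order, and a mode is dropped when its p-norm distance to an already kept mode is below the tolerance. The function weed_modes and the sample values are hypothetical, and the package's actual weedtype variants and adaptive tolerance are not reproduced here.

```r
# p-norm of a vector; p = 2 gives Euclidean distance.
p_norm <- function(x, p = 2) sum(abs(x)^p)^(1 / p)

# Hypothetical weeding rule: returns the row indices of the modes kept.
weed_modes <- function(modes, loglik, eps = 0.01, p = 2) {
  ord <- order(loglik, decreasing = TRUE)
  keep <- integer(0)
  for (i in ord) {
    too_close <- vapply(
      keep,
      function(j) p_norm(modes[i, ] - modes[j, ], p) < eps,
      logical(1)
    )
    if (!any(too_close)) keep <- c(keep, i)
  }
  keep
}

# Three made-up modes; the second nearly coincides with the first.
modes  <- rbind(c(0.20, 0.40), c(0.201, 0.401), c(-0.50, 0.10))
loglik <- c(-100.0, -100.5, -103.2)
kept   <- weed_modes(modes, loglik)
kept  # rows 1 and 3 remain, in order of log-likelihood
```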
Author(s)
JQ (Justin) Veenstra
References
McLeod, A. I., Yu, H. and Krougly, Z. L. (2007) Algorithms for Linear Time Series Analysis: With R Package. Journal of Statistical Software, Vol. 23, Issue 5.
Veenstra, J. Q. Persistence and Antipersistence: Theory and Software (PhD Thesis).
Borwein, P. (1995) An efficient algorithm for Riemann Zeta function. Canadian Math. Soc. Conf. Proc., 27, pp. 29-34.
See Also
arfima.sim, SeriesJ, arfima-package
Examples
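library(arfima)

# Simulate an ARFIMA(2, d, 1) series and fit it with back = TRUE.
set.seed(8564)
sim <- arfima.sim(1000, model = list(phi = c(0.2, 0.1),
                                     dfrac = 0.4, theta = 0.9))
fit <- arfima(sim, order = c(2, 0, 1), back = TRUE)
fit

# Fit an ARFIMA(1, d, 1) to the tmpyr data with three starts per parameter,
# then plot the tacvf of the fit up to lag 30.
data(tmpyr)
fit <- arfima(tmpyr, order = c(1, 0, 1), numeach = c(3, 3))
fit
plot(tacvf(fit), maxlag = 30, tacf = TRUE)

# Transfer function model with short-memory (lmodel = "n") AR(2) errors.
data(SeriesJ)
attach(SeriesJ)
fitTF <- arfima(YJ, order = c(2, 0, 0), xreg = XJ,
                reglist = list(regpar = c(2, 2, 3)), lmodel = "n")
fitTF
detach(SeriesJ)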
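A further sketch shows the fixed argument in use, following the AR(2) case described under Arguments: the first autoregressive parameter is held at 0 and the second is left free. The simulated series, the seed and the object names are assumptions made for illustration, and the fit only runs when the arfima package is installed.

```r
# Fix the first AR parameter at 0; NA leaves the second one free.
fixed_spec <- list(phi = c(0, NA))

if (requireNamespace("arfima", quietly = TRUE)) {
  set.seed(123)
  # Illustrative data: an ARFIMA(2, d, 0) series.
  z <- arfima::arfima.sim(500, model = list(phi = c(0.2, 0.1), dfrac = 0.3))
  fit_fixed <- arfima::arfima(z, order = c(2, 0, 0), fixed = fixed_spec)
  print(fit_fixed)
}
```

Note that fixed_spec$phi has length 2, matching p = 2 in order, as the fixed argument requires.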