fitGSMAR {uGMAR}R Documentation

Estimate Gaussian or Student's t Mixture Autoregressive model

Description

fitGSMAR estimates GMAR, StMAR, or G-StMAR model in two phases. In the first phase, a genetic algorithm is employed to find starting values for a gradient based method. In the second phase, the gradient based variable metric algorithm is utilized to accurately converge to a local maximum or a saddle point near each starting value. Parallel computing is used to conduct multiple rounds of estimations in parallel.

Usage

fitGSMAR(
  data,
  p,
  M,
  model = c("GMAR", "StMAR", "G-StMAR"),
  restricted = FALSE,
  constraints = NULL,
  conditional = TRUE,
  parametrization = c("intercept", "mean"),
  ncalls = round(10 + 9 * log(sum(M))),
  ncores = 2,
  maxit = 500,
  seeds = NULL,
  print_res = TRUE,
  filter_estimates = TRUE,
  ...
)

Arguments

data

a numeric vector or class 'ts' object containing the data. NA values are not supported.

p

a positive integer specifying the autoregressive order of the model.

M
For GMAR and StMAR models:

a positive integer specifying the number of mixture components.

For G-StMAR models:

a size (2x1) integer vector specifying the number of GMAR type components M1 in the first element and StMAR type components M2 in the second element. The total number of mixture components is M=M1+M2.

model

is "GMAR", "StMAR", or "G-StMAR" model considered? In the G-StMAR model, the first M1 components are GMAR type and the rest M2 components are StMAR type.

restricted

a logical argument stating whether the AR coefficients \phi_{m,1},...,\phi_{m,p} are restricted to be the same for all regimes.

constraints

specifies linear constraints imposed to each regime's autoregressive parameters separately.

For non-restricted models:

a list of size (pxq_{m}) constraint matrices C_{m} of full column rank satisfying \phi_{m}=C_{m}\psi_{m} for all m=1,...,M, where \phi_{m}=(\phi_{m,1},...,\phi_{m,p}) and \psi_{m}=(\psi_{m,1},...,\psi_{m,q_{m}}).

For restricted models:

a size (pxq) constraint matrix C of full column rank satisfying \phi=C\psi, where \phi=(\phi_{1},...,\phi_{p}) and \psi=\psi_{1},...,\psi_{q}.

The symbol \phi denotes an AR coefficient. Note that regardless of any constraints, the autoregressive order is always p for all regimes. Ignore or set to NULL if applying linear constraints is not desired.

conditional

a logical argument specifying whether the conditional or exact log-likelihood function should be used.

parametrization

is the model parametrized with the "intercepts" \phi_{m,0} or "means" \mu_{m} = \phi_{m,0}/(1-\sum\phi_{i,m})?

ncalls

a positive integer specifying how many rounds of estimation should be conducted. The estimation results may vary from round to round because of multimodality of the log-likelihood function and the randomness associated with the genetic algorithm.

ncores

the number of CPU cores to be used in the estimation process.

maxit

the maximum number of iterations for the variable metric algorithm.

seeds

a length ncalls vector containing the random number generator seed for each call to the genetic algorithm, or NULL for not initializing the seed. Exists for the purpose of creating reproducible results.

print_res

should the estimation results be printed?

filter_estimates

should likely inappropriate estimates be filtered? See details.

...

additional settings passed to the function GAfit employing the genetic algorithm.

Details

Because of complexity and multimodality of the log-likelihood function, it's not guaranteed that the estimation algorithm will end up in the global maximum point. It's often expected that most of the estimation rounds will end up in some local maximum point instead, and therefore a number of estimation rounds is required for reliable results. Because of the nature of the models, the estimation may fail particularly in the cases where the number of mixture components is chosen too large. Note that the genetic algorithm is designed to avoid solutions with mixing weights of some regimes too close to zero at almost all times ("redundant regimes") but the settings can, however, be adjusted (see ?GAfit).

If the iteration limit for the variable metric algorithm (maxit) is reached, one can continue the estimation by iterating more with the function iterate_more.

The core of the genetic algorithm is mostly based on the description by Dorsey and Mayer (1995). It utilizes a slightly modified version the individually adaptive crossover and mutation rates described by Patnaik and Srinivas (1994) and employs (50%) fitness inheritance discussed by Smith, Dike and Stegmann (1995). Large (in absolute value) but stationary AR parameter values are generated with the algorithm proposed by Monahan (1984).

The variable metric algorithm (or quasi-Newton method, Nash (1990, algorithm 21)) used in the second phase is implemented with function the optim from the package stats.

Additional Notes about the estimates:

Sometimes the found MLE is very close to the boundary of the stationarity region some regime, the related variance parameter is very small, and the associated mixing weights are "spiky". This kind of estimates often maximize the log-likelihood function for a technical reason that induces by the endogenously determined mixing weights. In such cases, it might be more appropriate to consider the next-best local maximum point of the log-likelihood function that is well inside the parameter space. Models based local-only maximum points can be built with the function alt_gsmar by adjusting the argument which_largest accordingly.

Some mixture components of the StMAR model may sometimes get very large estimates for the degrees of freedom parameters. Such parameters are weakly identified and induce various numerical problems. However, mixture components with large degree of freedom parameter estimates are similar to the mixture components of the GMAR model. It's hence advisable to further estimate a G-StMAR model by allowing the mixture components with large degrees of freedom parameter estimates to be GMAR type with the function stmar_to_gstmar.

Filtering inappropriate estimates: If filter_estimates == TRUE, the function will automatically filter out estimates that it deems "inappropriate". That is, estimates that are not likely solutions of interest. Specifically, it filters out solutions that incorporate regimes with any modulus of the roots of the AR polynomial less than 1.0015; a variance parameter estimat near zero (less than 0.0015); mixing weights such that they are close to zero for almost all t for at least one regime; or mixing weight parameter estimate close to zero (or one). You can also set filter_estimates=FALSE and find the solutions of interest yourself by using the function alt_gsmar.

Value

Returns an object of class 'gsmar' defining the estimated GMAR, StMAR or G-StMAR model. The returned object contains estimated mixing weights, some conditional and unconditional moments, and quantile residuals. Note that the first p observations are taken as the initial values, so the mixing weights, conditional moments, and quantile residuals start from the p+1:th observation (interpreted as t=1). In addition, the returned object contains the estimates and log-likelihoods from all of the estimation rounds. See ?GSMAR for the form of the parameter vector, if needed.

S3 methods

The following S3 methods are supported for class 'gsmar' objects: print, summary, plot, predict, simulate, logLik, residuals.

References

See Also

GSMAR, iterate_more, , stmar_to_gstmar, add_data, profile_logliks, swap_parametrization, get_gradient, simulate.gsmar, predict.gsmar, diagnostic_plot, quantile_residual_tests, cond_moments, uncond_moments, LR_test, Wald_test

Examples


## These are long running examples that use parallel computing.
## The below examples take approximately 90 seconds to run.

## Note that the number of estimation rounds (ncalls) is relatively small
## in the below examples to reduce the time required for running the examples.
## For reliable results, a large number of estimation rounds is recommended!

# GMAR model
fit12 <- fitGSMAR(data=simudata, p=1, M=2, model="GMAR", ncalls=4, seeds=1:4)
summary(fit12)
plot(fit12)
profile_logliks(fit12)
diagnostic_plot(fit12)

# StMAR model (large estimate of the degrees of freedom)
fit42t <- fitGSMAR(data=M10Y1Y, p=4, M=2, model="StMAR", ncalls=2, seeds=c(1, 6))
summary(fit42t) # Overly large 2nd regime degrees of freedom estimate!
fit42gs <- stmar_to_gstmar(fit42t) # Switch to G-StMAR model
summary(fit42gs) # An appropriate G-StMVAR model with one G and one t regime
plot(fit42gs)

# Restricted StMAR model
fit42r <- fitGSMAR(M10Y1Y, p=4, M=2, model="StMAR", restricted=TRUE,
                   ncalls=2, seeds=1:2)
fit42r

# G-StMAR model with one GMAR type and one StMAR type regime
fit42gs <- fitGSMAR(M10Y1Y, p=4, M=c(1, 1), model="G-StMAR",
                    ncalls=1, seeds=4)
fit42gs

# The following three examples demonstrate how to apply linear constraints
# to the autoregressive (AR) parameters.

# Two-regime GMAR p=2 model with the second AR coeffiecient of
# of the second regime contrained to zero.
C22 <- list(diag(1, ncol=2, nrow=2), as.matrix(c(1, 0)))
fit22c <- fitGSMAR(M10Y1Y, p=2, M=2, constraints=C22, ncalls=1, seeds=6)
fit22c

# StMAR(3, 1) model with the second order AR coefficient constrained to zero.
C31 <- list(matrix(c(1, 0, 0, 0, 0, 1), ncol=2))
fit31tc <- fitGSMAR(M10Y1Y, p=3, M=1, model="StMAR", constraints=C31,
                    ncalls=1, seeds=1)
fit31tc

# Such StMAR(3, 2) model that the AR coefficients are restricted to be
# the same for both regimes and the second AR coefficients are
# constrained to zero.
fit32rc <- fitGSMAR(M10Y1Y, p=3, M=2, model="StMAR", restricted=TRUE,
                    constraints=matrix(c(1, 0, 0, 0, 0, 1), ncol=2),
                    ncalls=1, seeds=1)
fit32rc


[Package uGMAR version 3.5.0 Index]