object |
list object returned from mvgam. See mvgam()
|
formula |
Optional new formula object. Note that mvgam currently does not support dynamic formula
updates such as removal of specific terms with - term. When updating, the entire formula needs
to be supplied
|
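Because relative updates are not supported, the new formula has to be written out in full. A minimal sketch of that pattern follows; the variable names (y, season, year) and the fitted object mod are hypothetical, and the update() call is left commented out because it needs a fitted model.

```r
# Hypothetical formula used when the model was first fitted
old_formula <- y ~ s(season) + s(year)

# Not supported: a relative update such as . ~ . - s(year)
# Instead, spell out the complete replacement formula
new_formula <- y ~ s(season)

# Terms that the updated model would contain
new_terms <- attr(terms(new_formula), "term.labels")
print(new_terms)

# Assumed usage with a fitted model object `mod` (not run here):
# mod2 <- update(mod, formula = new_formula)
```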
trend_formula |
An optional character string specifying the GAM process model formula. If
supplied, a linear predictor will be modelled for the latent trends to capture process model evolution
separately from the observation model. Should not have a response variable specified on the left-hand side
of the formula (i.e. a valid option would be ~ season + s(year)). Also note that you should not use
the identifier series in this formula to specify effects that vary across time series. Instead you should use
trend. This will ensure that models in which a trend_map is supplied will still work consistently
(i.e. by allowing effects to vary across process models, even when some time series share the same underlying
process model). This feature is currently only available for RW(), AR() and VAR() trend models.
In nmix() family models, the trend_formula is used to set up a linear predictor for the underlying
latent abundance
|
data |
A dataframe or list containing the model response variable and covariates
required by the GAM formula and optional trend_formula. Should include columns:
series (a factor index of the series IDs; the number of levels should be identical
to the number of unique series labels (i.e. n_series = length(levels(data$series))))
time (numeric or integer index of the time point for each observation).
For most dynamic trend types available in mvgam (see argument trend_model), time should be
measured in discrete, regularly spaced intervals (i.e. c(1, 2, 3, ...)). However, you can
use irregularly spaced intervals if using trend_model = CAR(1), though note that any
temporal intervals that are exactly 0 will be adjusted to a very small number
(1e-12) to prevent sampling errors. See an example of CAR() trends in CAR.
Should also include any other variables to be included in the linear predictor of formula
|
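As a rough illustration of the layout described above, the sketch below builds a small long-format data frame in base R and checks the series and time requirements. The series labels, the response y and the simulated counts are all made up for the example.

```r
set.seed(1)

# Two hypothetical series, each observed at 5 regularly spaced time points
dat <- data.frame(
  series = factor(rep(c("series_1", "series_2"), each = 5)),
  time = rep(1:5, times = 2),
  y = rpois(10, lambda = 3)
)

# Number of series implied by the factor levels
n_series <- length(levels(dat$series))

# Every factor level should actually occur in the data
levels_used <- all(levels(dat$series) %in% as.character(dat$series))

# Within each series, time should advance in steps of exactly 1
regular_time <- all(tapply(dat$time, dat$series,
                           function(t) all(diff(sort(t)) == 1)))
```

Checks like these are cheap to run before fitting and catch unused factor levels or gaps in the time index early.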
newdata |
Optional dataframe or list of test data containing at least series and time
in addition to any other variables included in the linear predictor of formula. If included, the
observations in variable y will be set to NA when fitting the model so that posterior
simulations can be obtained
|
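One way to prepare data and newdata is to split a single data frame at a time point, so that the held-out rows keep the series and time columns. This is only a sketch under assumed names (dat, data_train, data_test and the fitted object mod are hypothetical); the update() call is commented out because it requires a fitted model.

```r
set.seed(1)

# Hypothetical data: 2 series observed over 10 time points
dat <- data.frame(
  series = factor(rep(c("series_1", "series_2"), each = 10)),
  time = rep(1:10, times = 2),
  y = rpois(20, lambda = 3)
)

# Use the first 8 time points for fitting, the last 2 for forecasting
data_train <- dat[dat$time <= 8, ]
data_test <- dat[dat$time > 8, ]

# Assumed usage with a fitted model object `mod` (not run here):
# mod2 <- update(mod, data = data_train, newdata = data_test)
```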
trend_model |
character or function specifying the time series dynamics for the latent trend. Options are:
-
None (no latent trend component; i.e. the GAM component is all that contributes to the linear predictor,
and the observation process is the only source of error; similar to what is estimated by gam)
-
'RW' or RW()
-
'AR1' or AR(p = 1)
-
'AR2' or AR(p = 2)
-
'AR3' or AR(p = 3)
-
'CAR1' or CAR(p = 1)
-
'VAR1' or VAR() (only available in Stan)
-
'PWlogistic', 'PWlinear' or PW() (only available in Stan)
-
'GP' or GP() (Gaussian Process with squared exponential kernel;
only available in Stan)
For all trend types apart from GP(), CAR() and PW(), moving average and/or correlated
process error terms can also be estimated (for example, RW(cor = TRUE) will set up a
multivariate Random Walk if n_series > 1). See mvgam_trends for more details
|
trend_map |
Optional data.frame specifying which series should depend on which latent
trends. Useful for allowing multiple series to depend on the same latent trend process, but with
different observation processes. If supplied, a latent factor model is set up by setting
use_lv = TRUE and using the mapping to set up the shared trends. Needs to have column names
series and trend, with integer values in the trend column to state which trend each series
should depend on. The series column should have a single unique entry for each series in the
data (names should perfectly match factor levels of the series variable in data). See examples
for details
|
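A trend_map matching this description might look like the sketch below, where three hypothetical series share two latent trends. The series labels are invented and would need to match the factor levels of series in your own data.

```r
# Hypothetical series labels; these must match levels(data$series)
series_levels <- c("series_1", "series_2", "series_3")

# series_1 and series_2 share trend 1; series_3 has its own trend 2
trend_map <- data.frame(
  series = factor(series_levels, levels = series_levels),
  trend = c(1L, 1L, 2L)
)

# One row per series, integer trend indices
one_row_per_series <- anyDuplicated(trend_map$series) == 0
n_trends <- length(unique(trend_map$trend))
```

Mapping two series onto the same trend index is what lets them share a latent process while keeping separate observation processes.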
use_lv |
logical. If TRUE, use dynamic factors to estimate series'
latent trends in a reduced dimension format. Only available for
RW(), AR() and GP() trend models. Defaults to FALSE
|
n_lv |
integer. The number of latent dynamic factors to use if use_lv == TRUE.
Cannot be > n_series. Defaults arbitrarily to min(2, floor(n_series / 2))
|
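The default stated above can be evaluated directly for a few values of n_series. The helper name default_n_lv is hypothetical and only restates the documented expression; it is not a function exported by mvgam.

```r
# Hypothetical helper restating the documented default for n_lv
default_n_lv <- function(n_series) {
  min(2, floor(n_series / 2))
}

# Defaults for 2, 3, 4 and 10 series
n_lv_defaults <- sapply(c(2, 3, 4, 10), default_n_lv)
print(n_lv_defaults)
# 1 1 2 2
```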
family |
family specifying the exponential observation family for the series. Currently supported
families are:
gaussian() for real-valued data
betar() for proportional data on (0,1)
lognormal() for non-negative real-valued data
student_t() for real-valued data
Gamma() for non-negative real-valued data
bernoulli() for binary data
poisson() for count data
nb() for overdispersed count data
binomial() for count data with imperfect detection when the number of trials is known;
note that the cbind() function must be used to bind the discrete observations and the discrete number
of trials
beta_binomial() as for binomial() but allows for overdispersion
nmix() for count data with imperfect detection when the number of trials
is unknown and should be modeled via a State-Space N-Mixture model.
The latent states are Poisson, capturing the 'true' latent
abundance, while the observation process is Binomial to account for
imperfect detection.
See mvgam_families for an example of how to use this family
Note that only nb() and poisson() are available if using JAGS as the backend.
Default is poisson().
See mvgam_families for more details
|
share_obs_params |
logical. If TRUE and the family
has additional family-specific observation parameters (e.g. variance components in
student_t() or gaussian(), or dispersion parameters in nb() or betar()),
these parameters will be shared across all series. This is handy if you have multiple
time series that you believe share some properties, such as being from the same
species over different spatial units. Default is FALSE.
|
priors |
An optional data.frame with prior
definitions (in JAGS or Stan syntax). If using Stan, this can also be an object of
class brmsprior (see prior for details). See get_mvgam_priors and
'Details' for more information on changing default prior distributions
|
chains |
integer specifying the number of parallel chains for the model. Ignored
if algorithm %in% c('meanfield', 'fullrank', 'pathfinder', 'laplace')
|
burnin |
integer specifying the number of warmup iterations of the Markov chain to run
to tune sampling algorithms. Ignored
if algorithm %in% c('meanfield', 'fullrank', 'pathfinder', 'laplace')
|
samples |
integer specifying the number of post-warmup iterations of the Markov chain to run for
sampling the posterior distribution
|
threads |
integer. Experimental option to use multithreading for within-chain
parallelisation in Stan. We recommend its use only if you are experienced with
Stan's reduce_sum function and have a slow running model that cannot be sped
up by any other means. Only available for some families (poisson(), nb(), gaussian()) and
when using Cmdstan as the backend
|
algorithm |
Character string naming the estimation approach to use.
Options are "sampling" for MCMC (the default), "meanfield" for
variational inference with factorized normal distributions,
"fullrank" for variational inference with a multivariate normal
distribution, "laplace" for a Laplace approximation (only available
when using cmdstanr as the backend) or "pathfinder" for the pathfinder
algorithm (only currently available when using cmdstanr as the backend).
Can be set globally for the current R session via the
"brms.algorithm" option (see options). Limited testing
suggests that "meanfield" performs best out of the non-MCMC approximations for
dynamic GAMs, possibly because of the difficulties estimating covariances among the
many spline parameters and latent trend parameters, but rigorous testing has not
been carried out
|
lfo |
Logical indicating whether this is part of a call to lfo_cv.mvgam. If TRUE, returns a
lighter version of the model with no residuals and fewer monitored parameters to speed up
post-processing. Other downstream functions will not work properly with this lighter object, so users
should always leave this set as FALSE
|
... |
Other arguments to be passed to mvgam
|