likelihood |
A string specifying the likelihood function (distribution) of the response variable.
Available options:
"gaussian"
"bernoulli_probit": binary data with Bernoulli likelihood and a probit link function
"bernoulli_logit": binary data with Bernoulli likelihood and a logit link function
"gamma": gamma distribution with a log link function
"poisson": Poisson distribution with a log link function
"negative_binomial": negative binomial distribution with a log link function
Note: other likelihoods could be implemented upon request
|
group_data |
A vector or matrix whose columns are categorical grouping variables.
The elements are the group levels that define the grouped random effects.
The elements of 'group_data' can be integer, double, or character.
The number of columns corresponds to the number of grouped (intercept) random effects
|
group_rand_coef_data |
A vector or matrix with numeric covariate data
for grouped random coefficients
|
ind_effect_group_rand_coef |
A vector with integer indices that
indicate the corresponding categorical grouping variable (=column) in 'group_data' for
every covariate in 'group_rand_coef_data'. Counting starts at 1.
The length of this index vector must equal the number of covariates in 'group_rand_coef_data'.
For instance, c(1,1,2) means that the first two covariates (=first two columns) in 'group_rand_coef_data'
have random coefficients corresponding to the first categorical grouping variable (=first column) in 'group_data',
and the third covariate (=third column) in 'group_rand_coef_data' has a random coefficient
corresponding to the second grouping variable (=second column) in 'group_data'
|
drop_intercept_group_rand_effect |
A vector of type logical (boolean).
Indicates which intercept random effects are dropped so that only the random coefficients (random slopes) remain.
If drop_intercept_group_rand_effect[k] is TRUE, the intercept random effect number k is dropped / not included.
An intercept random effect can only be dropped if the corresponding grouping variable has random slopes.
|
gp_coords |
A matrix with numeric coordinates (= inputs / features) for defining Gaussian processes
|
gp_rand_coef_data |
A vector or matrix with numeric covariate data for
Gaussian process random coefficients
|
cov_function |
A string specifying the covariance function for the Gaussian process.
Available options:
"exponential": Exponential covariance function (using the parametrization of Diggle and Ribeiro, 2007)
"gaussian": Gaussian, aka squared exponential, covariance function (using the parametrization of Diggle and Ribeiro, 2007)
"matern": Matern covariance function with the smoothness specified by
the cov_fct_shape parameter (using the parametrization of Rasmussen and Williams, 2006)
"powered_exponential": powered exponential covariance function with the exponent specified by
the cov_fct_shape parameter (using the parametrization of Diggle and Ribeiro, 2007)
"wendland": Compactly supported Wendland covariance function (using the parametrization of Bevilacqua et al., 2019, AOS)
"matern_space_time": Spatio-temporal Matern covariance function with different range parameters for space and time.
Note that the first column in gp_coords must correspond to the time dimension
"matern_ard": anisotropic Matern covariance function with Automatic Relevance Determination (ARD),
i.e., with a different range parameter for every coordinate dimension / column of gp_coords
"gaussian_ard": anisotropic Gaussian, aka squared exponential, covariance function with Automatic Relevance Determination (ARD),
i.e., with a different range parameter for every coordinate dimension / column of gp_coords
|
cov_fct_shape |
A numeric specifying the shape parameter of the covariance function
(=smoothness parameter for the Matern covariance, exponent for the powered exponential covariance).
This parameter is irrelevant for some covariance functions such as the exponential or Gaussian
|
gp_approx |
A string specifying the large data approximation
for Gaussian processes. Available options:
"none": No approximation
"vecchia": A Vecchia approximation; see Sigrist (2022, JMLR) for more details
"tapering": The covariance function is multiplied by
a compactly supported Wendland correlation function
"fitc": Fully Independent Training Conditional approximation aka
modified predictive process approximation; see Gyger, Furrer, and Sigrist (2024) for more details
"full_scale_tapering": A full scale approximation combining an
inducing point / predictive process approximation with tapering on the residual process;
see Gyger, Furrer, and Sigrist (2024) for more details
|
cov_fct_taper_range |
A numeric specifying the range parameter
of the Wendland covariance function and Wendland correlation taper function.
We follow the notation of Bevilacqua et al. (2019, AOS)
|
cov_fct_taper_shape |
A numeric specifying the shape (=smoothness) parameter
of the Wendland covariance function and Wendland correlation taper function.
We follow the notation of Bevilacqua et al. (2019, AOS)
|
num_neighbors |
An integer specifying the number of neighbors for
the Vecchia approximation. Note: for prediction, the number of neighbors can
be set through the 'num_neighbors_pred' parameter in the 'set_prediction_data'
function. By default, num_neighbors_pred = 2 * num_neighbors. Further,
the type of Vecchia approximation used for making predictions is set through
the 'vecchia_pred_type' parameter in the 'set_prediction_data' function
|
vecchia_ordering |
A string specifying the ordering used in
the Vecchia approximation. Available options:
"none": the default ordering in the data is used
"random": a random ordering
"time": ordering according to time (only for space-time models)
"time_random_space": ordering according to time and randomly for all
spatial points with the same time points (only for space-time models)
|
ind_points_selection |
A string specifying the method for choosing inducing points.
Available options:
"kmeans++": the k-means++ algorithm
"cover_tree": the cover tree algorithm
"random": random selection from data points
|
num_ind_points |
An integer specifying the number of inducing
points / knots for, e.g., a predictive process approximation
|
cover_tree_radius |
A numeric specifying the radius (= "spatial resolution")
for the cover tree algorithm
|
matrix_inversion_method |
A string specifying the method used for inverting covariance matrices.
Available options:
"cholesky": Cholesky factorization
"iterative": iterative methods, i.e., a combination of the conjugate gradient method, the Lanczos algorithm, and other methods.
This is currently only supported for the following cases:
likelihood != "gaussian" and gp_approx == "vecchia" (non-Gaussian likelihoods with a Vecchia-Laplace approximation)
likelihood == "gaussian" and gp_approx == "full_scale_tapering" (Gaussian likelihood with a full-scale tapering approximation)
|
seed |
An integer specifying the seed used for model creation
(e.g., random ordering in Vecchia approximation)
|
cluster_ids |
A vector with elements indicating independent realizations of
random effects / Gaussian processes (same values = same process realization).
The elements of 'cluster_ids' can be integer, double, or character.
|
free_raw_data |
A boolean. If TRUE, the data (groups, coordinates, covariate data for random coefficients)
are freed in R after initialization
|
vecchia_approx |
Discontinued. Use the argument gp_approx instead
|
vecchia_pred_type |
A string specifying the type of Vecchia approximation used for making predictions.
This is discontinued here. Use the function 'set_prediction_data' to specify this
|
num_neighbors_pred |
An integer specifying the number of neighbors for making predictions.
This is discontinued here. Use the function 'set_prediction_data' to specify this
|
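
The following is a minimal sketch of how several of the arguments above fit together. It assumes that they belong to the `GPModel` constructor of the gpboost R package, which this section does not name explicitly. The simulated data, the variable names `re_model` and `gp_model`, and the chosen values (for example `num_neighbors = 20` and `cov_fct_shape = 1.5`) are illustrative only. The model-construction calls run only if gpboost is installed; the data preparation uses base R.

```r
set.seed(1)
n <- 100  # illustrative sample size

# Two categorical grouping variables, one per column of group_data
group_data <- cbind(
  group1 = sample(1:10, n, replace = TRUE),
  group2 = sample(1:5, n, replace = TRUE)
)

# Three numeric covariates that receive grouped random coefficients
group_rand_coef_data <- matrix(rnorm(3 * n), ncol = 3)

# Covariates 1 and 2 vary by group1, covariate 3 varies by group2 (counting starts at 1)
ind_effect_group_rand_coef <- c(1, 1, 2)

# The index vector needs one entry per covariate, each pointing to a column of group_data
stopifnot(
  length(ind_effect_group_rand_coef) == ncol(group_rand_coef_data),
  all(ind_effect_group_rand_coef %in% seq_len(ncol(group_data)))
)

# Two-dimensional coordinates for a Gaussian process
gp_coords <- matrix(runif(2 * n), ncol = 2)

# Assumption: the arguments are passed to gpboost::GPModel
if (requireNamespace("gpboost", quietly = TRUE)) {
  # Grouped random intercepts plus grouped random coefficients
  re_model <- gpboost::GPModel(
    likelihood = "gaussian",
    group_data = group_data,
    group_rand_coef_data = group_rand_coef_data,
    ind_effect_group_rand_coef = ind_effect_group_rand_coef
  )

  # Gaussian process with a Matern covariance and a Vecchia approximation
  gp_model <- gpboost::GPModel(
    likelihood = "gaussian",
    gp_coords = gp_coords,
    cov_function = "matern",
    cov_fct_shape = 1.5,
    gp_approx = "vecchia",
    num_neighbors = 20,
    vecchia_ordering = "random"
  )
}
```

The checks on `ind_effect_group_rand_coef` mirror the length requirement stated in its description, so a mismatch between covariates and grouping columns is caught before a model is created.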