trophicSDM_CV {webSDM}    R Documentation

Compute K-fold cross-validation predicted values from a fitted trophicSDM model

Description

Once the CV predicted values are obtained, their quality can be evaluated with evaluateModelFit().

Usage

trophicSDM_CV(
  tSDM,
  K,
  partition = NULL,
  prob.cov = FALSE,
  pred_samples = NULL,
  iter = NULL,
  chains = NULL,
  run.parallel = FALSE,
  verbose = FALSE
)

Arguments

tSDM

A trophicSDMfit object obtained with trophicSDM()

K

The number of folds for the K-fold cross validation

partition

Optional. A vector assigning each site to one of the K folds used for cross-validation.

prob.cov

Parameter used to predict with trophicSDM for presence-absence data. Whether to use the predicted probability of presence (prob.cov = TRUE) or the transformed presence-absences (default, prob.cov = FALSE) to predict species distributions.

pred_samples

Number of samples to draw from the species posterior predictive distribution when method = "stan_glm". If NULL, defaults to the number of iterations/10.

iter

For method = "stan_glm": number of iterations for each MCMC chain used to fit the trophicSDM model. Defaults to the number of iterations used to fit the provided trophicSDMfit object.

chains

For method = "stan_glm": number of MCMC chains used to fit the trophicSDM model. Defaults to the number of chains used to fit the provided trophicSDMfit object.

run.parallel

Whether to parallelise code when possible. Defaults to FALSE. Can reduce computation time.

verbose

Whether to print the progress of the algorithm.
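
As an illustration of the partition argument, the sketch below builds a random, balanced fold assignment in base R. It assumes that partition is a vector with one integer fold label (from 1 to K) per site, which is consistent with the example further down; make_partition is a hypothetical helper and not part of webSDM.

```r
# Hypothetical helper (not part of webSDM): randomly assign n sites to K folds,
# with fold sizes differing by at most one.
make_partition <- function(n, K, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Repeat the labels 1..K until n labels are available, then shuffle them
  sample(rep(seq_len(K), length.out = n))
}

# Assumed setting: 1000 sites split into 5 folds
partition <- make_partition(n = 1000, K = 5, seed = 1)
table(partition)
```

A vector built this way could then be supplied as partition = partition together with the matching value of K.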

Value

A list containing:

meanPred

a sites x species matrix of predicted occurrences of species for each site (e.g. probability of presence). With stan_glm, the posterior predictive mean is returned.

Pred975, Pred025

Only for method = "stan_glm": the 97.5% and 2.5% quantiles of the posterior predictive distribution.

partition

the partition vector used to compute the K fold cross-validation
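
To make the relationship between the partition vector and the predicted values concrete, here is a minimal sketch of generic K-fold cross-validation for a single species, using base R glm on simulated data. It is not the webSDM implementation: the data, the variable names and the single-species model are all assumptions made for illustration. The point is only that each site's prediction comes from a model fitted without that site's fold.

```r
set.seed(42)

# Simulated presence-absence data for one species (illustrative only)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 + x))
dat <- data.frame(x = x, y = y)

# Assign each site to one of K folds
K <- 4
partition <- sample(rep(seq_len(K), length.out = n))

# For each fold, fit on the other folds and predict the held-out sites
cv_pred <- rep(NA_real_, n)
for (k in seq_len(K)) {
  test <- partition == k
  fit <- glm(y ~ x, family = binomial(link = "logit"), data = dat[!test, ])
  cv_pred[test] <- predict(fit, newdata = dat[test, ], type = "response")
}

# cv_pred now holds one out-of-fold predicted probability per site
head(cv_pred)
```

In this sketch cv_pred plays the role of one column of meanPred, and partition records which fold each site was held out in.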

Author(s)

Giovanni Poggiato

Examples

data(Y, X, G)
# define abiotic part of the model
env.formula = "~ X_1 + X_2"
# Run the model with bottom-up control using stan_glm as fitting method and no penalisation
# (set iter = 1000 to obtain reliable results)

m = trophicSDM(Y, X, G, env.formula, iter = 50, 
               family = binomial(link = "logit"), penal = NULL, 
               mode = "prey", method = "stan_glm")

# Run a 3-fold (K=3) cross validation. Predictions are made using presence-absences of prey
# (prob.cov = FALSE, see ?predict.trophicSDM) with 10 draws from the posterior distribution
# (pred_samples = 10)
CV = trophicSDM_CV(m, K = 3, prob.cov = FALSE, pred_samples = 10, run.parallel = FALSE)
# Use predicted values to evaluate model goodness of fit in cross validation
Ypred = CV$meanPred[,colnames(Y)]

evaluateModelFit(m, Ynew = Y, Ypredicted = Ypred)

# Now with K = 2 and by specifying the partition of sites
m = trophicSDM(Y, X, G, env.formula, iter = 50,
               family = binomial(link = "logit"), penal = NULL, 
               mode = "prey", method = "glm")
partition = c(rep(1,500),rep(2,500))
CV = trophicSDM_CV(m, K = 2, partition = partition, prob.cov = FALSE,
                   pred_samples = 10, run.parallel = FALSE)
Ypred = CV$meanPred[,colnames(Y)]
evaluateModelFit(m, Ynew = Y, Ypredicted = Ypred)

[Package webSDM version 1.1-4]