trophicSDM_CV {webSDM} | R Documentation |
Compute K-fold cross-validation predicted values from a fitted trophicSDM model
Description
Once the CV predicted values are obtained, their quality can be evaluated with evaluateModelFit().
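As a reminder of what K-fold cross-validation computes, the following is a minimal sketch in base R using a plain binomial glm on simulated data. It illustrates the general technique only and is not the internal implementation of trophicSDM_CV(); the data frame `dat`, the vector `fold` and the vector `cv_pred` are hypothetical names introduced for this illustration.

```r
# Generic K-fold cross-validation with a plain binomial glm (base R only).
# Illustration of the technique; NOT the webSDM implementation.
set.seed(1)
n <- 200   # number of simulated sites (assumption for this sketch)
K <- 4     # number of folds

# Simulated presence-absence data driven by one covariate
dat <- data.frame(x = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * dat$x))

# Assign each row to one of K folds of equal size, in random order
fold <- sample(rep(seq_len(K), length.out = n))

# Held-out predictions: each row is predicted by a model fitted without it
cv_pred <- rep(NA_real_, n)
for (k in seq_len(K)) {
  test <- fold == k
  fit <- glm(y ~ x, family = binomial(link = "logit"), data = dat[!test, ])
  cv_pred[test] <- predict(fit, newdata = dat[test, , drop = FALSE],
                           type = "response")
}
```

Each element of `cv_pred` is a predicted probability of presence for a site that was excluded from the fit that produced it.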
Usage
trophicSDM_CV(
tSDM,
K,
partition = NULL,
prob.cov = FALSE,
pred_samples = NULL,
iter = NULL,
chains = NULL,
run.parallel = FALSE,
verbose = FALSE
)
Arguments
tSDM |
A trophicSDMfit object obtained with trophicSDM() |
K |
The number of folds for the K-fold cross validation |
partition |
Optional. A vector assigning each site to one of the K folds, used to specify the partition for cross-validation |
prob.cov |
Only used when predicting with presence-absence data. Whether to use the predicted probability of presence (prob.cov = TRUE) or the transformed presence-absences (default, prob.cov = FALSE) to predict species distributions. |
pred_samples |
Number of samples to draw from the species' posterior predictive distribution when method = "stan_glm". If NULL, defaults to the number of iterations divided by 10. |
iter |
For method = "stan_glm": number of iterations of each MCMC chain used to fit the trophicSDM model. Defaults to the number of iterations used to fit the provided trophicSDMfit object |
chains |
For method = "stan_glm": number of MCMC chains used to fit the trophicSDM model. Defaults to the number of chains used to fit the provided trophicSDMfit object |
run.parallel |
Whether to parallelise the code when possible, which can speed up computation. Defaults to FALSE. |
verbose |
Whether to print the progress of the algorithm |
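A partition vector of the kind passed in the Examples section (one fold label per site) can be built with base R. The sketch below shows a random and a contiguous assignment; `n_sites`, `random_partition` and `blocked_partition` are hypothetical names, and the assumption that fold labels are the integers 1 to K is taken from the example on this page rather than from a stated requirement.

```r
# Two ways to build a fold-label vector for K-fold cross-validation (base R).
set.seed(42)
n_sites <- 1000  # hypothetical number of sites; use nrow(Y) with real data
K <- 5

# Random assignment: folds of equal size, sites shuffled across folds
random_partition <- sample(rep(seq_len(K), length.out = n_sites))

# Contiguous assignment: the first block of sites in fold 1, the next in fold 2, ...
blocked_partition <- rep(seq_len(K), each = ceiling(n_sites / K),
                         length.out = n_sites)

# Check fold sizes
table(random_partition)
table(blocked_partition)
```

Either vector could then be supplied through the partition argument together with the matching value of K.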
Value
A list containing:
meanPred |
a sites x species matrix of predicted occurrences of species at each site (e.g. probability of presence). With stan_glm, the posterior predictive mean is returned |
Pred975 , Pred025 |
Only for method = "stan_glm": the 97.5% and 2.5% quantiles of the posterior predictive distribution |
partition |
the partition vector used to compute the K fold cross-validation |
Author(s)
Giovanni Poggiato
Examples
data(Y, X, G)
# define abiotic part of the model
env.formula = "~ X_1 + X_2"
# Run the model with bottom-up control using stan_glm as fitting method and no penalisation
# (iter = 50 keeps the example fast; set iter = 1000 to obtain reliable results)
m = trophicSDM(Y, X, G, env.formula, iter = 50,
family = binomial(link = "logit"), penal = NULL,
mode = "prey", method = "stan_glm")
# Run a 3-fold (K = 3) cross-validation. Predictions are made using presence-absences of prey
# (prob.cov = FALSE, see ?predict.trophicSDM) with 10 draws from the posterior distribution
# (pred_samples = 10)
CV = trophicSDM_CV(m, K = 3, prob.cov = FALSE, pred_samples = 10, run.parallel = FALSE)
# Use predicted values to evaluate model goodness of fit in cross validation
Ypred = CV$meanPred[,colnames(Y)]
evaluateModelFit(m, Ynew = Y, Ypredicted = Ypred)
# Now with K = 2, using glm as fitting method and specifying the partition of sites
m = trophicSDM(Y, X, G, env.formula, iter = 50,
family = binomial(link = "logit"), penal = NULL,
mode = "prey", method = "glm")
partition = c(rep(1,500),rep(2,500))
CV = trophicSDM_CV(m, K = 2, partition = partition, prob.cov = FALSE,
pred_samples = 10, run.parallel = FALSE)
Ypred = CV$meanPred[,colnames(Y)]
evaluateModelFit(m, Ynew = Y, Ypredicted = Ypred)