lfJSDM {spOccupancy} | R Documentation |
Function for Fitting a Latent Factor Joint Species Distribution Model
Description
Function for fitting a joint species distribution model with species correlations. This model does not explicitly account for imperfect detection (see lfMsPGOcc()
). We use Polya-gamma latent variables and a factor modeling approach.
Usage
lfJSDM(formula, data, inits, priors, n.factors,
n.samples, n.omp.threads = 1, verbose = TRUE, n.report = 100,
n.burn = round(.10 * n.samples), n.thin = 1, n.chains = 1,
k.fold, k.fold.threads = 1, k.fold.seed, k.fold.only = FALSE, ...)
Arguments
formula |
a symbolic description of the model to be fit for the model using R's model syntax. Only right-hand side of formula is specified. See example below. Random intercepts are allowed using lme4 syntax (Bates et al. 2015). |
data |
a list containing data necessary for model fitting.
Valid tags are |
inits |
a list with each tag corresponding to a parameter name.
Valid tags are |
priors |
a list with each tag corresponding to a parameter name.
Valid tags are |
n.factors |
the number of factors to use in the latent factor model approach.
Typically, the number of factors is set to be small (e.g., 4-5) relative to the
total number of species in the community, which will lead to substantial
decreases in computation time. However, the value can be anywhere
between 0 and N (the number of species in the community). When set to 0, the model
assumes there are no residual species correlations, which is equivalent to the
|
n.samples |
the number of posterior samples to collect in each chain. |
n.omp.threads |
a positive integer indicating the number of threads
to use for SMP parallel processing. The package must be compiled for
OpenMP support. For most Intel-based machines, we recommend setting
|
verbose |
if |
n.report |
the interval to report MCMC progress. |
n.burn |
the number of samples out of the total |
n.thin |
the thinning interval for collection of MCMC samples. The
thinning occurs after the |
n.chains |
the number of chains to run in sequence. |
k.fold |
specifies the number of k folds for cross-validation.
If not specified as an argument, then cross-validation is not performed
and |
k.fold.threads |
number of threads to use for cross-validation. If
|
k.fold.seed |
seed used to split data set into |
k.fold.only |
a logical value indicating whether to only perform
cross-validation ( |
... |
currently no additional arguments |
Value
An object of class lfJSDM
that is a list comprised of:
beta.comm.samples |
a |
tau.sq.beta.samples |
a |
beta.samples |
a |
lambda.samples |
a |
psi.samples |
a three-dimensional array of posterior samples for the latent probability of occurrence/detection values for each species. |
sigma.sq.psi.samples |
a |
w.samples |
a three-dimensional array of posterior samples for the latent effects for each latent factor. |
beta.star.samples |
a |
like.samples |
a three-dimensional array of posterior samples for the likelihood value associated with each site and species. Used for calculating WAIC. |
rhat |
a list of Gelman-Rubin diagnostic values for some of the model parameters. |
ESS |
a list of effective sample sizes for some of the model parameters. |
run.time |
MCMC sampler execution time reported using |
k.fold.deviance |
vector of scoring rules (deviance) from k-fold cross-validation.
A separate value is reported for each species.
Only included if |
The return object will include additional objects used for
subsequent prediction and/or model fit evaluation. Note that detection probability
estimated values are not included in the model object, but can be extracted
using fitted()
.
Note
Some of the underlying code used for generating random numbers from the Polya-Gamma distribution is taken from the pgdraw package written by Daniel F. Schmidt and Enes Makalic. Their code implements Algorithm 6 in PhD thesis of Jesse Bennett Windle (2013) https://repositories.lib.utexas.edu/handle/2152/21842.
Author(s)
Jeffrey W. Doser doserjef@msu.edu,
Andrew O. Finley finleya@msu.edu
References
Polson, N.G., J.G. Scott, and J. Windle. (2013) Bayesian Inference for Logistic Models Using Polya-Gamma Latent Variables. Journal of the American Statistical Association, 108:1339-1349.
Bates, Douglas, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01.
Hooten, M. B., and Hobbs, N. T. (2015). A guide to Bayesian model selection for ecologists. Ecological monographs, 85(1), 3-28.
Examples
set.seed(400)
J.x <- 10
J.y <- 10
J <- J.x * J.y
n.rep <- rep(1, J)
N <- 10
# Community-level covariate effects
# Occurrence
beta.mean <- c(0.2, 0.6, 1.5)
p.occ <- length(beta.mean)
tau.sq.beta <- c(0.6, 1.2, 1.7)
# Detection
# Fix this to be constant and really close to 1.
alpha.mean <- c(9)
tau.sq.alpha <- c(0.05)
p.det <- length(alpha.mean)
# Random effects
# Include a single random effect
psi.RE <- list(levels = c(20),
sigma.sq.psi = c(2))
p.RE <- list()
# Draw species-level effects from community means.
beta <- matrix(NA, nrow = N, ncol = p.occ)
alpha <- matrix(NA, nrow = N, ncol = p.det)
for (i in 1:p.occ) {
beta[, i] <- rnorm(N, beta.mean[i], sqrt(tau.sq.beta[i]))
}
for (i in 1:p.det) {
alpha[, i] <- rnorm(N, alpha.mean[i], sqrt(tau.sq.alpha[i]))
}
alpha.true <- alpha
# Factor model
factor.model <- TRUE
n.factors <- 4
dat <- simMsOcc(J.x = J.x, J.y = J.y, n.rep = n.rep, N = N, beta = beta, alpha = alpha,
psi.RE = psi.RE, p.RE = p.RE, sp = FALSE,
factor.model = TRUE, n.factors = 4)
X <- dat$X
y <- dat$y
X.re <- dat$X.re
coords <- dat$coords
occ.covs <- cbind(X, X.re)
colnames(occ.covs) <- c('int', 'occ.cov.1', 'occ.cov.2', 'occ.re.1')
data.list <- list(y = y[, , 1],
covs = occ.covs,
coords = coords)
# Priors
prior.list <- list(beta.comm.normal = list(mean = 0, var = 2.72),
tau.sq.beta.ig = list(a = 0.1, b = 0.1))
inits.list <- list(beta.comm = 0, beta = 0, tau.sq.beta = 1)
out <- lfJSDM(formula = ~ occ.cov.1 + occ.cov.2 + (1 | occ.re.1),
data = data.list,
inits = inits.list,
priors = prior.list,
n.factors = 4,
n.samples = 1000,
n.report = 500,
n.burn = 500,
n.thin = 2,
n.chains = 1)
summary(out)