about.lvs {boral} | R Documentation |

## Correlation structure for latent variables

### Description

This help file provides more information on how (non-independence) correlation structures can be assumed for latent variables.

### Details

In the main boral function, when latent variables are included, the default option is to assume that the latent variables are independent across the rows (sites) of the response matrix i.e., `lv.type = "independent"`. That is, `\bm{u}_i \sim N(\bm{0},\bm{I}_d)` where `d = num.lv`. This is useful when we want to model between-species correlations (in a parsimonious manner), but it does assume that sites are independent.

If one *a-priori* believes that the sites are, in fact, correlated e.g., due to spatial correlation, and that this cannot be sufficiently well accounted for by row effects (see comment below), then we can account for it by assuming a non-independence correlation structure for the latent variables across sites. Note however that we continue to assume that the `d` latent variables are independent of one another. That is, if we let `\bm{u}_i = (u_{i1}, \ldots, u_{id})`, then we assume that for `l = 1,\ldots,d`,

`(u_{1l}, u_{2l}, \ldots, u_{nl}) \sim N(\bm{0}, \bm{\Sigma}),`

where `\bm{\Sigma}` is some correlation matrix. When `\bm{\Sigma} = \bm{I}_n` we are back in the independence case. However, if we allow the off-diagonals to be non-zero, then we allow the latent variables to be correlated across sites, with `\Sigma_{ij} = Cov(u_{il}, u_{jl})`. This in turn induces correlation across sites and species i.e., two species at two different sites are now correlated because of the correlation across sites.
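As a minimal sketch of the structure described above, the following base R code draws `d` latent variables that are independent of one another but correlated across `n` sites. The values of `n` and `d`, and the particular correlation matrix used, are arbitrary illustrative choices and are not taken from boral.

```r
set.seed(1)
n <- 5   # hypothetical number of sites
d <- 2   # hypothetical number of latent variables

# An arbitrary valid correlation matrix across sites: entry (i, j) is 0.5^|i - j|.
# Setting Sigma <- diag(n) instead recovers the independence case.
Sigma <- 0.5^abs(outer(1:n, 1:n, "-"))

# chol() returns an upper triangular R with t(R) %*% R = Sigma,
# so t(R) %*% z has covariance Sigma when z ~ N(0, I_n).
R <- chol(Sigma)
z <- matrix(rnorm(n * d), nrow = n, ncol = d)

# n x d matrix of latent variables: each column is one draw from N(0, Sigma),
# and the d columns are drawn independently of one another.
u <- t(R) %*% z
```

Each row of `u` plays the role of `\bm{u}_i` for one site, and each column is one realization of `(u_{1l}, \ldots, u_{nl})`.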

While there are fancier structures and attempts at accounting for correlations between sites (Cressie and Wikle, 2015), in boral we assume relatively simple structures. Specifically, we can assume that sites further apart are less correlated, so that `\bm{\Sigma}` can be characterized by a distance matrix `distmat` and associated spatial covariance parameters which require estimation. Indeed, such simple spatial latent variable models have become rather popular in community ecology of late, at least as a first attempt at accounting for spatial (and also temporal) correlation e.g., Thorson et al. (2015, 2016); Ovaskainen et al. (2016, 2017).

At the moment, several correlation structures are permitted. Let `D_{ij}` denote the distance between sites `i` and `j` i.e., entry `(i,j)` in `distmat`. Also, let `(\vartheta_1,\vartheta_2)` denote the two spatial covariance parameters (noting that the second parameter is not required for some of the structures). Then we have:

1. `lv.type = "exponential"`, such that `\Sigma_{ij} = \exp(-D_{ij}/\vartheta_1)`;
2. `lv.type = "squared.exponential"`, such that `\Sigma_{ij} = \exp(-D_{ij}/\vartheta_1^2)`;
3. `lv.type = "power.exponential"`, such that `\Sigma_{ij} = \exp(-(D_{ij}/\vartheta_1)^{\vartheta_2})` where `\vartheta_2 \in (0,2]`;
4. `lv.type = "spherical"`, such that `\Sigma_{ij} = (D_{ij} < \vartheta_1)*(1 - 1.5*D_{ij}/\vartheta_1 + 0.5*(D_{ij}/\vartheta_1)^3)`.

We refer the reader to the `geoR` package and its function `cov.spatial` for more, relatively simple, information on spatial covariance functions (Ribeiro Jr and Diggle, 2016).
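To make the structures above concrete, here is a hedged base R sketch that builds `\bm{\Sigma}` from a distance matrix. The function `lv_cormat` and its argument `covparams` are hypothetical names introduced for illustration and are not part of boral; the formulas are transcribed as written above, and this is not necessarily how boral computes them internally.

```r
# Hypothetical helper (not a boral function): correlation matrix across sites
# from a distance matrix, for each of the structures listed above.
# covparams[1] is vartheta_1; covparams[2] is vartheta_2 (power.exponential only).
lv_cormat <- function(distmat, type, covparams) {
  v1 <- covparams[1]
  switch(type,
    exponential = exp(-distmat / v1),
    squared.exponential = exp(-distmat / v1^2),
    power.exponential = exp(-(distmat / v1)^covparams[2]),
    # (distmat < v1) is a 0/1 indicator, so correlation is zero beyond v1
    spherical = (distmat < v1) *
      (1 - 1.5 * distmat / v1 + 0.5 * (distmat / v1)^3),
    stop("unknown type: ", type)
  )
}

# Fake distance matrix for four sites spaced one unit apart on a line
fakedistmat <- as.matrix(dist(1:4))
Sigma_exp <- lv_cormat(fakedistmat, "exponential", covparams = 2)
Sigma_sph <- lv_cormat(fakedistmat, "spherical", covparams = 2)
```

In both cases the diagonal equals one, since `D_{ii} = 0`. Under the spherical structure, any pair of sites at distance `\vartheta_1` or more is uncorrelated, whereas the exponential structure only decays towards zero.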

It is important to keep in mind that moving away from an independence correlation structure for the latent variables *massively* increases computation time for MCMC sampling (and indeed for any estimation method for latent variable models). Given that JAGS is not the fastest of methods when it comes to MCMC sampling, one should be cautious about moving away from independence. For example, if you *a-priori* have a nested experimental design which is inducing spatial correlation, then it is much faster and more effective to include (multiple) row effects in the model to account for this spatial correlation instead.

### Author(s)

Francis K.C. Hui [aut, cre], Wade Blanchard [aut]

Maintainer: Francis K.C. Hui <fhui28@gmail.com>

### References

Cressie, N. and Wikle, C. K. (2015) Statistics for Spatio-temporal Data. John Wiley & Sons.

Ovaskainen et al. (2016). Uncovering hidden spatial structure in species communities with spatially explicit joint species distribution models. Methods in Ecology and Evolution, 7, 428-436.

Ovaskainen et al. (2017). How to make more out of community data? A conceptual framework and its implementation as models and software. Ecology Letters, 20, 561-576.

Ribeiro Jr, P. J., and Diggle P. J., (2016). geoR: Analysis of Geostatistical Data. R package version 1.7-5.2. https://CRAN.R-project.org/package=geoR.

Thorson et al. (2015). Spatial factor analysis: a new tool for estimating joint species distributions and correlations in species range. Methods in Ecology and Evolution, 6, 627-63

Thorson et al. (2016). Joint dynamic species distribution models: a tool for community ordination and spatio-temporal monitoring. Global Ecology and Biogeography, 25, 1144-1158.

### See Also

`boral` for the main boral fitting function.

### Examples

```
library(mvabund) ## Load a dataset from the mvabund package
data(spider)
y <- spider$abun
X <- scale(spider$x)
n <- nrow(y)
p <- ncol(y)
## NOTE: The example below is taken directly from the boral help file
example_mcmc_control <- list(n.burnin = 10, n.iteration = 100,
    n.thin = 1)
testpath <- file.path(tempdir(), "jagsboralmodel.txt")
## Not run:
## Example 2d - model with environmental covariates and
## two structured latent variables using fake distance matrix
fakedistmat <- as.matrix(dist(1:n))
spiderfit_lvstruc <- boral(y, X = X, family = "negative.binomial",
    lv.control = list(num.lv = 2, type = "exponential", distmat = fakedistmat),
    mcmc.control = example_mcmc_control, model.name = testpath)
summary(spiderfit_lvstruc)
## End(Not run)
```

[Package *boral* version 2.0.2]