run_eDITH_BT {eDITH}R Documentation

Run eDITH with BayesianTools

Description

Function that runs a Bayesian sampler estimating parameters of an eDITH model

Usage

run_eDITH_BT(data, river, covariates = NULL, Z.normalize = TRUE,
           use.AEM = FALSE, n.AEM = NULL, par.AEM = NULL,
           no.det = FALSE, ll.type = "norm", source.area = "AG",
           mcmc.settings = NULL, likelihood = NULL, 
		   prior = NULL, sampler.type = "DREAMzs",
           tau.prior = list(spec = "lnorm", a = 0, b = Inf, 
			meanlog = log(5), sd = sqrt(log(5) - log(4))),
           log_p0.prior = list(spec="unif",min=-20, max=0),
           beta.prior = list(spec="norm",sd=1),
           sigma.prior = list(spec="unif",min=0, max=max(data$values, na.rm = TRUE)),
           omega.prior = list(spec="unif",min=1, max=10*max(data$values, na.rm = TRUE)),
           Cstar.prior = list(spec="unif",min=0, max=max(data$values, na.rm = TRUE)),
		   verbose = FALSE)

Arguments

data

eDNA data. Data frame containing columns ID (index of the AG node/reach where the eDNA sample was taken) and values (value of the eDNA measurement, expressed as concentration or number of reads).

river

A river object generated via aggregate_river.

covariates

Data frame containing covariate values for all river reaches. If NULL (default option), production rates are estimated via AEMs.

Z.normalize

Logical. Should covariates be Z-normalized?

use.AEM

Logical. Should eigenvectors based on AEMs be used as covariates? If covariates = NULL, it is set to TRUE. If TRUE and covariates are provided, AEM eigenvectors are appended to the covariates data frame.

n.AEM

Number of AEM eigenvectors (sorted by the decreasing respective eigenvalue) to be used as covariates. If par.AEM$moranI = TRUE, this parameter is not used. Instead, the eigenvectors with significantly positive spatial autocorrelation are used as AEM covariates.

par.AEM

List of additional parameters that are passed to river_to_AEM for calculation of AEMs. In particular, par.AEM$moranI = TRUE imposes the use of AEM covariates with significantly positive spatial autocorrelation based on Moran's I statistic.

no.det

Logical. Should a probability of non-detection be included in the model?

ll.type

Character. String defining the error distribution used in the log-likelihood formulation. Allowed values are norm (for normal distribution), lnorm (for lognormal distribution), nbinom (for negative binomial distribution) and geom (for geometric distribution). The two latter choices are suited when eDNA data are expressed as read numbers, while norm and lnorm are better suited to eDNA concentrations.

source.area

Defines the extent of the source area of a node. Possible values are "AG" (if the source area is the reach surface, i.e. length*width), "SC" (if the source area is the subcatchment area), or, alternatively, a vector with length river$AG$nodes.

mcmc.settings

List. It is passed as argument settings in runMCMC. Default is list(iterations = 2.7e6, burnin=1.8e6, message = TRUE, thin = 10).

likelihood

Likelihood function to be passed as likelihood argument to createBayesianSetup. If not specified, it is generated based on arguments no.det and ll.type. If a custom likelihood is specified, a custom prior must also be specified.

prior

Prior function to be passed as prior argument to createBayesianSetup. If not specified, it is generated based on the *.prior arguments provided. If a user-defined prior is provided, parameter names must be included in prior$lower, prior$upper (see example).

sampler.type

Character. It is passed as argument sampler in runMCMC.

tau.prior

List that defines the prior distribution for the decay time parameter tau. See details.

log_p0.prior

List that defines the prior distribution for the logarithm (in base 10) of the baseline production rate p0. See details. If covariates = NULL, this defines the prior distribution for the logarithm (in base 10) of production rates for all river reaches.

beta.prior

List that defines the prior distribution for the covariate effects beta. See details. If a single spec is provided, the same prior distribution is specified for all beta parameters. Alternatively, if spec (and the other arguments, if provided) is a vector with length equal to the number of covariates included, different prior distributions can be specified for the different beta parameters.

sigma.prior

List that defines the prior distribution for the standard deviation of the measurement error when ll.type is "norm" or "lnorm". It is not used if ll.type = "nbinom". See details.

omega.prior

List that defines the prior distribution for the overdispersion parameter omega of the measurement error when ll.type = "nbinom". It is not used if ll.type is "norm" or "lnorm". See details.

Cstar.prior

List that defines the prior distribution for the Cstar parameter controlling the probability of no detection. It is only used if no.det = TRUE. See details.

verbose

Logical. Should console output be displayed?

Details

The arguments of the type *.prior consist in the lists of arguments required by dtrunc (except the first argument x).

By default, AEMs are computed without attributing weights to the edges of the river network. Use e.g. par.AEM = list(weight = "gravity") to attribute weights.

Value

A list with objects:

param_map

Vector of named parameters corresponding to the maximum a posteriori estimate. It is the output of the call to MAP.

p_map

Vector of best-fit eDNA production rates corresponding to the maximum a posteriori parameter estimate param_map. It has length equal to river$AG$nNodes.

C_map

Vector of best-fit eDNA values (in the same unit as data$values, i.e. concentrations or read numbers) corresponding to the maximum a posteriori parameter estimate param_map. It has length equal to river$AG$nNodes.

probDet_map

Vector of best-fit detection probabilities corresponding to the maximum a posteriori parameter estimate param_map. It has length equal to river$AG$nNodes. If a custom likelihood is provided, this is a vector of null length (in which case the user should calculate the probability of detection independently, based on the chosen likelihood).

cI

Output of the call to getCredibleIntervals.

gD

Output of the call to gelmanDiagnostics.

covariates

Data frame containing input covariate values (possibly Z-normalized).

source.area

Vector of source area values.

outMCMC

Object of class mcmcSampler returned by the call to runMCMC.

Moreover, arguments ll.type (possibly changed to "custom" if a custom likelihood is specified), no.det and data are added to the list.

Examples

data(wigger)
data(dataC)
data(dataRead)

# reduce number of iterations for illustrative purposes 
# (use default mcmc.settings to ensure convergence)
settings.short <- list(iterations = 1e3, thin = 10)
set.seed(1)
out <- run_eDITH_BT(dataC, wigger, mcmc.settings = settings.short)


library(rivnet)
# best-fit (maximum a posteriori) map of eDNA production rates
plot(wigger, out$p_map)

# best-fit map (maximum a posteriori) of detection probability
plot(wigger, out$probDet_map)


# compare best-fit vs observed eDNA concentrations
plot(out$C_map[dataC$ID], dataC$values,
	xlab="Modelled (MAP) concentrations", ylab="Observed concentrations")
abline(a=0, b=1) 

## fit eDNA read number data - use AEMs as covariates
out <- run_eDITH_BT(dataRead, wigger, ll.type = "nbinom",
	par.AEM = list(weight = "gravity"),
	mcmc.settings = settings.short) # use default mcmc.settings to ensure convergence

## use user-defined covariates
covariates <- data.frame(urban = wigger$SC$locCov$landcover_1,
                         agriculture = wigger$SC$locCov$landcover_2,
                         forest = wigger$SC$locCov$landcover_3,
                         elev = wigger$AG$Z,
                         log_drainageArea = log(wigger$AG$A))
						 
out.cov <- 	run_eDITH_BT(dataC, wigger, covariates, 
	mcmc.settings = settings.short) # use default mcmc.settings to ensure convergence

# use user-defined covariates and AEMs
out.covAEM <- 	run_eDITH_BT(dataC, wigger, covariates, 
	use.AEM = TRUE, par.AEM = list(weight = "gravity"),
	mcmc.settings = settings.short) # use default mcmc.settings to ensure convergence	

# use AEMs with significantly positive spatial autocorrelation
out.AEM.moran <- run_eDITH_BT(dataC, wigger, use.AEM = TRUE,
	par.AEM = list(weight = "gravity", moranI = TRUE), 
	mcmc.settings = settings.short) # use default mcmc.settings to ensure convergence 

## use posterior sample to specify user-defined prior
library(BayesianTools)
data(outSample)
pp <- createPriorDensity(outSample$outMCMC)
# Important! add parameter names to objects lower, upper
names(pp$lower) <- names(pp$upper) <- colnames(outSample$outMCMC$chain[[1]])[1:8] 
# the three last columns are for log-posterior, log-likelihood, log-prior 
out.new <- run_eDITH_BT(dataC, wigger, covariates, prior = pp, 
	mcmc.settings = settings.short)
	


[Package eDITH version 1.0.0 Index]