R: Markov Chain Monte Carlo-based tool for analyzing (SIR/SINR)...

epictmcmc {EpiILMCT}

R Documentation

Markov Chain Monte Carlo-based tool for analyzing (SIR/SINR) distance-based/network-based individual-level disease data.

Description

Runs a Bayesian data augmented MCMC algorithm for fitting specified models (SIR or SINR) with three possible assumptions regarding the data set: 1. a completely observed epidemic; 2. unknown infection times but known removal times; or 3. unknown removal and infection times but known notification times (SINR only).

Usage


epictmcmc(object, distancekernel = NULL, datatype, blockupdate = NULL, nsim,

nchains = NULL, control.sus = NULL, control.trans = NULL, kernel.par = NULL,

spark.par = NULL, delta = NULL, gamma.par = NULL, periodproposal = NULL,

parallel = FALSE)

Arguments

`object`	an object of class “datagen” that can be the output of `datagen` or `as.epidat` functions.
`distancekernel`	the spatial kernel type when `kerneltype` is set to “distance” or “both”. Choices are “powerlaw” for a power-law distance kernel or “Cauchy” for a Cauchy distance kernel.
`datatype`	the ILM data set is analyzed based on three possible assumptions. The possible assumptions are “known epidemic” for a completely observed epidemic; “known removal” times for observing only the removal times for the SIR model, or removal and notified times for the SINR model; or “unknown removal” times for observing only the notified times which is applicable only when fitting the SINR model.
`blockupdate`	a vector of the number of initially observed infection times and the size of blocks for updating the infection times/infectious periods (SIR model), or infection times/incubation periods and removal times/delay periods (SINR model). The default is observing only the first infection time and performing the MCMC updating infection time parameters via single parameter updates.
`nsim`	the number of MCMC iterations to be carried out for a single chain.
`nchains`	the number of MCMC chains to be run. Default is 1 (a single MCMC chain).
`control.sus`	a list of arguments for updating the parameters of the susceptibility function (>0): 1st: the susceptibility parameters (list): a vector (of length `n_{s}`) or a `n_{s} \times \code{nchains}` matrix of initial values of the susceptibility parameters, a `n_{s} \times 4` matrix of the required arguments for updating the susceptibility parameters, 2nd: an `n \times n_{s}` matrix of the susceptibility covariates. 3rd: the power parameters of the susceptibility function (list): a vector (of length `n_{s}`) or a `n_{s} \times \code{nchains}` matrix of initial values of the power parameters, a `n_{s} \times 4` matrix of the required arguments for updating the power parameters, where `n` and `n_s` are the number of individuals and number of susceptibility parameters, respectively. Default = NULL means the model does not include these parameters. See details for a description of required arguments.
`control.trans`	a list of arguments with the same structure as `control.sus`, but for the transmissibility function (>0).
`kernel.par`	a list of arguments for updating the spatial parameter of the distance-based kernel. The first argument is a vector of initial values of the spatial parameter when `kerneltype` is set to “distance” (>0) or a `2 \times \code{nchains}` matrix of initial values of the spatial and network effect parameters when `kerneltype` is set to “both” (>0). The second argument is a vector of arguments for updating the spatial parameter of the distance-based kernel when `kerneltype` is set to “distance” or a `2 \times 4` matrix of arguments for updating the spatial and network effect parameters when `kerneltype` is set to “both”. Default = NULL means the model does not include these parameters (i.e., we are fitting a network-based ILM). See details for a description of required arguments.
`spark.par`	a list of arguments for updating the spark parameter (>=0). The first argument is a vector of the initial values. The second argument is a vector of the required arguments for updating the spark parameter. It should be defined if `kerneltype` is set to “network” with `datatype`= “known removal” or “unknown removal”. Default = NULL means the model does not include this parameter. See details for a description of required arguments.
`delta`	a list of arguments for updating the infectious period rate for an SIR model, or for updating the incubation and delay periods rates for an SINR model (>0). The arguments for updating each rate are a vector of the fixed and known shape parameter(s) for the distribution of the period type, a vector (matrix) of initial values of the rate(s) of the infectious, incubation and/or delay period distributions, and a vector (matrix) of the shape and rate parameters of the gamma prior distribution for the rate parameter(s). Default = NULL means these parameters do not need to be updated, which is the case when `datatype` is set to “known epidemic”.
`gamma.par`	a list of arguments for updating the notification effect parameter (SINR) (>=0). The first argument is a vector of initial values. The second argument is a vector of the required arguments for updating the notification effect parameter. Default = NULL means the parameter is not updated and is assigned a value equal to 1. See details for a description of required arguments.
`periodproposal`	a vector/matrix of the proposal distribution parameters of the independence sampler for updating the infectious period (SIR model), or the incubation and/or delay periods (SINR model). Here, we use a gamma proposal distribution. It is required when `datatype` is set to “known removal” or “unknown removal”. Default = NULL means the parameters of the gamma proposal distribution are the known shape and updated rate values based on the argument `delta`.
`parallel`	if set to “TRUE”, multiple MCMC chains are running in parallel. Default is “FALSE” means chains (>1) are run sequentially.

Details

Bayesian MCMC is performed to obtain posterior estimates of the model parameters and latent variables. When the datatype is set to “known removal” or “unknown removal”, we assume the infectious periods (SIR model) or the incubation and/or delay periods (SINR model) follow gamma distributions with fixed shape and unknown rate parameters. We assign a gamma prior distribution for the rate parameter with shape a and rate b. This leads the rate parameter(s) to have a gamma conditional distribution.

Under the SIR model, the conditional distribution is:

\delta|\theta,\boldsymbol{I},\boldsymbol{R} \sim \Gamma(m+a_{\delta},M+b_{\delta}),

where \delta is the rate of the infectious period distribution; m is the number of infected individuals; M=\sum_{i=1}^{m}{(R_{i}-I_{i})}; and a_{\delta} and b_{\delta} are the prior parameters for the infectious period rate, respectively.

Under the SINR model, the conditional distribution is:

\delta^{(inc)}|\theta,\boldsymbol{I},\boldsymbol{N},\boldsymbol{R} \sim \Gamma(m+a_{\delta^{(inc)}},M_{inc}+b_{\delta^{(inc)}}),

where \delta^{(inc)} is the rate of the incubation period distribution; M_{inc}=\sum_{i=1}^{m}{(N_{i}-I_{i})}; and a_{\delta^{(inc)}} and b_{\delta^{(inc)}} are the prior parameters for the incubation period rate, respectively; and

\delta^{(delay)}|\theta,\boldsymbol{I},\boldsymbol{N},\boldsymbol{R} \sim \Gamma(m+a_{\delta^{(delay)}},M_{delay}+b_{\delta^{(delay)}}),

where \delta^{(delay)} is the rate parameter of the delay period distribution; M_{delay}=\sum_{i=1}^{m}{(R_{i}-N_{i})}; and a_{\delta^{(delay)}} and b_{\delta^{(delay)}} are the prior parameters for the delay period rate, respectively.

A Gibbs update (i.e., sampling from the conditional posterior distribution) is then used for the infectious period (SIR model) or the incubation and/or delay period rates (SINR model). The required information for each period distribution are entered via the delta argument. We assume each period type follows a gamma distribution with fixed shape and unknown rate. Thus, to update the rate parameter of each period, delta is a list containing a vector of the fixed shape value(s), a vector (matrix) of the initial values of the rate(s), and a vector (matrix) of the parameters of the prior distribution of the rate parameter(s). In the case of incubation and delay periods being estimated, the input of the initial values is a 2 \times \code{nchains} matrix, and the prior parameters are given by a 2 \times 2 matrix in which each row contains the required information for each period rate.

An independence sampler is used for updating the infection times and infectious periods (SIR model), or the infection and/or removal times, and the incubation and/or delay periods (SINR model). Under the SIR model, we update each infection time I_{i} by proposing infectious period D^{*}_{i} from a proposal distribution with tuning parameter, such that D_{i}\sim f(.). Then, the new infection time is just the difference between the fixed known removal time and the new infectious period of the i^{th} individual. Each infection time/infectious period is updated in this way in turn. The same procedure is applied for updating the missing event times and incubation and delay periods for the SINR model, with their corresponding parameters.

The control.sus and control.trans arguments provide all the information needed for the susceptibility and transmissibility functions. These arguments must be defined as a list of three objects. The first object is a list of: 1) a vector or matrix of initial values; and 2) a vector or matrix of the following arguments in order: initial value, prior distribution ("gamma", (positive) "half normal" or "uniform"), a vector of the prior distribution parameters, and the proposal variance. The second object is a matrix of the susceptibility or transmissibility risk factors. The third contains the power parameters of the susceptibility or transmissibility functions and has the same structure as the first object. The other model parameters kernel.par, spark.par and gamma.par also have the same structure as the first object. These model parameters, along with the model parameters of the susceptibility and transmissibility functions (control.sus and control.trans), are updated in turn using a random-walk Metropolis Hastings algorithm with normal proposals. The normal proposals must be tuned, via the proposal variance, by the user to achieve good mixing properties.

Note that, setting the variance of the normal proposal distribution to zero fixes a parameter at its initial value. This option allows the user to fix such a parameter in the MCMC procedure while updating others.

For faster run of the epictmcmc function, an option of running multiple chains on parallel can be performed, controlled via the two options nchains and parallel. If parallel is set to TRUE, the number of required cores is set to the minimum of nchains and the number of available cores on the user's computer.

Value

Returns an object of class epictmcmc that contains:

compart.framework:: the compartmental framework model used in the analysis.
kernel.type:: the used kernel.type in the function.
data.assumption:: the datatype.
parameter.samples:: the MCMC output of the updated model parameters.
log.posterior:: the log posterior densities.
acceptance.rate:: the acceptance rates for MCMC updates of each parameter.
number.iteration:: the number of iterations.
number.parameter:: the number of the unknown model parameters.
number.chains:: the number of the MCMC chains.
infection.times.samples:: the updated infection times when datatype is set to “known removal”.
Average.infectious.periods:: the average infectious period when type is set to “SIR” and datatype is set to “known removal”.
removal.times.samples:: the updated removal times when datatype is set to “unknown removal”.
Average.incubation.periods:: the average incubation period when type is set to “SINR” and datatype is set to either “known removal” or “unknown removal”.
Average.delay.periods:: the average delay period when type is set to “SINR” and datatype is set to “unknown removal”.

Examples

## This is for testing; the number of MCMC iterations needs to be
## mucher higher in practice.

## distance-based SIR continuous-time ILMs ##
data(SpatialData)
## performing the MCMC-tool for analyzing the fully observed spatial data
## under the SIR distance-based continuous ILM:
suspar <- list(NULL)
suspar[[1]] <- list(2, c("gamma", 1, 0.01, 0.5))
suspar[[2]] <- rep(1, length(SpatialData$epidat[,1]))
kernel1 <- list(2, c("gamma", 1, 0.01, 0.5))
mcmcres2 <- epictmcmc(object = SpatialData,
distancekernel = "powerlaw", datatype = "known epidemic", nsim = 50,
control.sus = suspar, kernel.par = kernel1)
summary(mcmcres2)

## performing the MCMC-tool for analyzing the partially observed spatial
## data (unknown infection times) under the SIR distance-based
## continuous ILM:
suspar <- list(NULL)
suspar[[1]]<-list(2,c("gamma", 1, 0.01, 0.8))
suspar[[2]]<- rep(1,length(SpatialData$epidat[,1]))
kernel1 <- list(2, c("gamma", 1, 0.01, 0.5))
mcmcres22 <- epictmcmc(object = SpatialData,
distancekernel = "powerlaw", datatype = "known removal", nsim = 50,
control.sus = suspar, kernel.par = kernel1, delta = list(1,2,c(4,2)))
summary(mcmcres22)

## distance-based and network-based SIR ILMs ##
data(SpatialNetData)
## performing the MCMC-tool for analyzing the fully observed spatial and
## network data under the SIR distance-based and network-based
## continuous-time ILM:
suspar <- list(NULL)
suspar[[1]] <- list(NULL)
suspar[[1]][[1]] <- c(0.08, 0.2)
suspar[[1]][[2]] <- matrix(c("gamma", "gamma", 1, 1, 0.01, 0.01, 0.1, 0.5),
ncol = 4, nrow = 2)
suspar[[2]] <- SpatialNetData[[2]]
kernel1 <- list(NULL)
kernel1[[1]] <- c(5, 0.5)
kernel1[[2]] <- matrix(c("gamma", "gamma", 1, 1, 0.01, 0.01, 0.5, 1),
ncol = 4, nrow = 2)
mcmcres3 <- epictmcmc(object = SpatialNetData[[1]],
distancekernel = "powerlaw", datatype = "known epidemic", nsim = 50,
control.sus = suspar, kernel.par = kernel1)
summary(mcmcres3)

## network-based SIR ILMs ##
data(NetworkData)
## performing the MCMC-tool for analyzing the fully observed network data
## under the SIR network-based continuous-time ILM:
suspar <- list(NULL)
suspar[[1]] <- list(NULL)
suspar[[1]][[1]] <- c(0.08, 0.5)
suspar[[1]][[2]] <- matrix(c("gamma", "gamma", 1, 1, 1, 1, 0.1, 0.5),
ncol = 4, nrow = 2)
suspar[[2]] <- NetworkData[[2]]
mcmcres4 <- epictmcmc(object = NetworkData[[1]], datatype = "known epidemic",
nsim = 50, control.sus = suspar)
summary(mcmcres4)

[Package EpiILMCT version 1.1.7 Index]