epictmcmc {EpiILMCT} | R Documentation |
Markov Chain Monte Carlo-based tool for analyzing (SIR/SINR) distance-based/network-based individual-level disease data.
Description
Runs a Bayesian data augmented MCMC algorithm for fitting specified models (SIR or SINR) with three possible assumptions regarding the data set: 1. a completely observed epidemic; 2. unknown infection times but known removal times; or 3. unknown removal and infection times but known notification times (SINR only).
Usage
epictmcmc(object, distancekernel = NULL, datatype, blockupdate = NULL, nsim,
nchains = NULL, control.sus = NULL, control.trans = NULL, kernel.par = NULL,
spark.par = NULL, delta = NULL, gamma.par = NULL, periodproposal = NULL,
parallel = FALSE)
Arguments
object |
an object of class “datagen” that can be the output of |
distancekernel |
the spatial kernel type when |
datatype |
the ILM data set is analyzed based on three possible assumptions. The possible assumptions are “known epidemic” for a completely observed epidemic; “known removal” times for observing only the removal times for the SIR model, or removal and notified times for the SINR model; or “unknown removal” times for observing only the notified times which is applicable only when fitting the SINR model. |
blockupdate |
a vector of the number of initially observed infection times and the size of blocks for updating the infection times/infectious periods (SIR model), or infection times/incubation periods and removal times/delay periods (SINR model). The default is observing only the first infection time and performing the MCMC updating infection time parameters via single parameter updates. |
nsim |
the number of MCMC iterations to be carried out for a single chain. |
nchains |
the number of MCMC chains to be run. Default is 1 (a single MCMC chain). |
control.sus |
a list of arguments for updating the parameters of the susceptibility function (>0):
where |
control.trans |
a list of arguments with the same structure as |
kernel.par |
a list of arguments for updating the spatial parameter of the distance-based kernel. The first argument is a vector of initial values of the spatial parameter when |
spark.par |
a list of arguments for updating the spark parameter (>=0). The first argument is a vector of the initial values. The second argument is a vector of the required arguments for updating the spark parameter. It should be defined if |
delta |
a list of arguments for updating the infectious period rate for an SIR model, or for updating the incubation and delay periods rates for an SINR model (>0). The arguments for updating each rate are a vector of the fixed and known shape parameter(s) for the distribution of the period type, a vector (matrix) of initial values of the rate(s) of the infectious, incubation and/or delay period distributions, and a vector (matrix) of the shape and rate parameters of the gamma prior distribution for the rate parameter(s). Default = NULL means these parameters do not need to be updated, which is the case when |
gamma.par |
a list of arguments for updating the notification effect parameter (SINR) (>=0). The first argument is a vector of initial values. The second argument is a vector of the required arguments for updating the notification effect parameter. Default = NULL means the parameter is not updated and is assigned a value equal to 1. See details for a description of required arguments. |
periodproposal |
a vector/matrix of the proposal distribution parameters of the independence sampler for updating the infectious period (SIR model), or the incubation and/or delay periods (SINR model). Here, we use a gamma proposal distribution. It is required when |
parallel |
if set to “TRUE”, multiple MCMC chains are running in parallel. Default is “FALSE” means chains (>1) are run sequentially. |
Details
Bayesian MCMC is performed to obtain posterior estimates of the model parameters and latent variables. When the datatype
is set to “known removal” or “unknown removal”, we assume the infectious periods (SIR model) or the incubation and/or delay periods (SINR model) follow gamma distributions with fixed shape and unknown rate parameters. We assign a gamma prior distribution for the rate parameter with shape a
and rate b
. This leads the rate parameter(s) to have a gamma conditional distribution.
Under the SIR model, the conditional distribution is:
\delta|\theta,\boldsymbol{I},\boldsymbol{R} \sim \Gamma(m+a_{\delta},M+b_{\delta}),
where \delta
is the rate of the infectious period distribution; m
is the number of infected individuals; M=\sum_{i=1}^{m}{(R_{i}-I_{i})}
; and a_{\delta}
and b_{\delta}
are the prior parameters for the infectious period rate, respectively.
Under the SINR model, the conditional distribution is:
\delta^{(inc)}|\theta,\boldsymbol{I},\boldsymbol{N},\boldsymbol{R} \sim \Gamma(m+a_{\delta^{(inc)}},M_{inc}+b_{\delta^{(inc)}}),
where \delta^{(inc)}
is the rate of the incubation period distribution; M_{inc}=\sum_{i=1}^{m}{(N_{i}-I_{i})}
; and a_{\delta^{(inc)}}
and b_{\delta^{(inc)}}
are the prior parameters for the incubation period rate, respectively; and
\delta^{(delay)}|\theta,\boldsymbol{I},\boldsymbol{N},\boldsymbol{R} \sim \Gamma(m+a_{\delta^{(delay)}},M_{delay}+b_{\delta^{(delay)}}),
where \delta^{(delay)}
is the rate parameter of the delay period distribution; M_{delay}=\sum_{i=1}^{m}{(R_{i}-N_{i})}
; and a_{\delta^{(delay)}}
and b_{\delta^{(delay)}}
are the prior parameters for the delay period rate, respectively.
A Gibbs update (i.e., sampling from the conditional posterior distribution) is then used for the infectious period (SIR model) or the incubation and/or delay period rates (SINR model). The required information for each period distribution are entered via the delta
argument. We assume each period type follows a gamma distribution with fixed shape and unknown rate. Thus, to update the rate parameter of each period, delta
is a list containing a vector of the fixed shape value(s), a vector (matrix) of the initial values of the rate(s), and a vector (matrix) of the parameters of the prior distribution of the rate parameter(s). In the case of incubation and delay periods being estimated, the input of the initial values is a 2 \times \code{nchains}
matrix, and the prior parameters are given by a 2 \times 2
matrix in which each row contains the required information for each period rate.
An independence sampler is used for updating the infection times and infectious periods (SIR model), or the infection and/or removal times, and the incubation and/or delay periods (SINR model). Under the SIR model, we update each infection time I_{i}
by proposing infectious period D^{*}_{i}
from a proposal distribution with tuning parameter, such that D_{i}\sim
f(.). Then, the new infection time is just the difference between the fixed known removal time and the new infectious period of the i^{th}
individual. Each infection time/infectious period is updated in this way in turn. The same procedure is applied for updating the missing event times and incubation and delay periods for the SINR model, with their corresponding parameters.
The control.sus
and control.trans
arguments provide all the information needed for the susceptibility and transmissibility functions. These arguments must be defined as a list of three objects. The first object is a list of: 1) a vector or matrix of initial values; and 2) a vector or matrix of the following arguments in order: initial value, prior distribution ("gamma", (positive) "half normal" or "uniform"), a vector of the prior distribution parameters, and the proposal variance. The second object is a matrix of the susceptibility or transmissibility risk factors. The third contains the power parameters of the susceptibility or transmissibility functions and has the same structure as the first object. The other model parameters kernel.par
, spark.par
and gamma.par
also have the same structure as the first object. These model parameters, along with the model parameters of the susceptibility and transmissibility functions (control.sus
and control.trans
), are updated in turn using a random-walk Metropolis Hastings algorithm with normal proposals. The normal proposals must be tuned, via the proposal variance, by the user to achieve good mixing properties.
Note that, setting the variance of the normal proposal distribution to zero fixes a parameter at its initial value. This option allows the user to fix such a parameter in the MCMC procedure while updating others.
For faster run of the epictmcmc
function, an option of running multiple chains on parallel can be performed, controlled via the two options nchains
and parallel
. If parallel
is set to TRUE, the number of required cores is set to the minimum of nchains
and the number of available cores on the user's computer.
Value
Returns an object of class epictmcmc
that contains:
- compart.framework:
the compartmental framework model used in the analysis.
- kernel.type:
the used
kernel.type
in the function.- data.assumption:
the
datatype
.- parameter.samples:
the MCMC output of the updated model parameters.
- log.posterior:
the log posterior densities.
- acceptance.rate:
the acceptance rates for MCMC updates of each parameter.
- number.iteration:
the number of iterations.
- number.parameter:
the number of the unknown model parameters.
- number.chains:
the number of the MCMC chains.
- infection.times.samples:
the updated infection times when
datatype
is set to “known removal”.- Average.infectious.periods:
the average infectious period when
type
is set to “SIR” anddatatype
is set to “known removal”.- removal.times.samples:
the updated removal times when
datatype
is set to “unknown removal”.- Average.incubation.periods:
the average incubation period when
type
is set to “SINR” anddatatype
is set to either “known removal” or “unknown removal”.- Average.delay.periods:
the average delay period when
type
is set to “SINR” anddatatype
is set to “unknown removal”.
See Also
print.epictmcmc
, summary.epictmcmc
, plot.epictmcmc
, datagen
, loglikelihoodepiILM
.
Examples
## This is for testing; the number of MCMC iterations needs to be
## mucher higher in practice.
## distance-based SIR continuous-time ILMs ##
data(SpatialData)
## performing the MCMC-tool for analyzing the fully observed spatial data
## under the SIR distance-based continuous ILM:
suspar <- list(NULL)
suspar[[1]] <- list(2, c("gamma", 1, 0.01, 0.5))
suspar[[2]] <- rep(1, length(SpatialData$epidat[,1]))
kernel1 <- list(2, c("gamma", 1, 0.01, 0.5))
mcmcres2 <- epictmcmc(object = SpatialData,
distancekernel = "powerlaw", datatype = "known epidemic", nsim = 50,
control.sus = suspar, kernel.par = kernel1)
summary(mcmcres2)
## performing the MCMC-tool for analyzing the partially observed spatial
## data (unknown infection times) under the SIR distance-based
## continuous ILM:
suspar <- list(NULL)
suspar[[1]]<-list(2,c("gamma", 1, 0.01, 0.8))
suspar[[2]]<- rep(1,length(SpatialData$epidat[,1]))
kernel1 <- list(2, c("gamma", 1, 0.01, 0.5))
mcmcres22 <- epictmcmc(object = SpatialData,
distancekernel = "powerlaw", datatype = "known removal", nsim = 50,
control.sus = suspar, kernel.par = kernel1, delta = list(1,2,c(4,2)))
summary(mcmcres22)
## distance-based and network-based SIR ILMs ##
data(SpatialNetData)
## performing the MCMC-tool for analyzing the fully observed spatial and
## network data under the SIR distance-based and network-based
## continuous-time ILM:
suspar <- list(NULL)
suspar[[1]] <- list(NULL)
suspar[[1]][[1]] <- c(0.08, 0.2)
suspar[[1]][[2]] <- matrix(c("gamma", "gamma", 1, 1, 0.01, 0.01, 0.1, 0.5),
ncol = 4, nrow = 2)
suspar[[2]] <- SpatialNetData[[2]]
kernel1 <- list(NULL)
kernel1[[1]] <- c(5, 0.5)
kernel1[[2]] <- matrix(c("gamma", "gamma", 1, 1, 0.01, 0.01, 0.5, 1),
ncol = 4, nrow = 2)
mcmcres3 <- epictmcmc(object = SpatialNetData[[1]],
distancekernel = "powerlaw", datatype = "known epidemic", nsim = 50,
control.sus = suspar, kernel.par = kernel1)
summary(mcmcres3)
## network-based SIR ILMs ##
data(NetworkData)
## performing the MCMC-tool for analyzing the fully observed network data
## under the SIR network-based continuous-time ILM:
suspar <- list(NULL)
suspar[[1]] <- list(NULL)
suspar[[1]][[1]] <- c(0.08, 0.5)
suspar[[1]][[2]] <- matrix(c("gamma", "gamma", 1, 1, 1, 1, 0.1, 0.5),
ncol = 4, nrow = 2)
suspar[[2]] <- NetworkData[[2]]
mcmcres4 <- epictmcmc(object = NetworkData[[1]], datatype = "known epidemic",
nsim = 50, control.sus = suspar)
summary(mcmcres4)