paraest.MCMC {modelSSE} | R Documentation |
To estimate model parameters using Markov chain Monte Carlo approach
Description
This function (i.e., paraest.MCMC()
) performs model parameter estimation using random walk Markov chain Monte Carlo (MCMC) approach with given structured contact tracing data.
Usage
paraest.MCMC(
can.epi.para.start = list(mean = 1, disp = 0.5, shift = 0.2),
isfix.epi.para = list(mean = FALSE, disp = FALSE, shift = FALSE),
offspring.type = "D",
para.comb.num = 10000,
burnin.frac = 0.33,
data = NULL,
var.name = list(obssize = NULL, seedsize = NULL, typelab = NULL),
obs.type.lab = list(offspring = NULL, nextgen = NULL, outbreak = NULL)
)
Arguments
can.epi.para.start |
A list ( |
isfix.epi.para |
A list ( |
offspring.type |
A character label (
By default, |
para.comb.num |
A positive integer for the number of iterations used for MCMC runs.
By default, |
burnin.frac |
A number with range strictly between 0 and 1 for the fraction of MCMC chain that will be discarded as the "burn-in" stage.
By default, |
data |
A data frame ( |
var.name |
A list ( |
obs.type.lab |
A list ( |
Details
For the values of parameters given in can.epi.para.start
,
they are some rough starting points for the MCMC to run, which are not necessarily to be precise (but have to be within a reasonable range), and
they will used as a "start" status to find the posterior MCMC samples.
When obs.type.lab
is a character, it should be either "offspring"
, "nextgen"
, or "outbreak"
for type of observations.
When obs.type.lab
is a list, this occurs when the contact tracing data has more than one types of observations.
When the contact tracing dataset is offspring case observations, the function arguments data
could be either a vector, or a data frame.
If data
is a vector, it is not necessary to assign any value to var.name
.
If data
is a data frame, it is necessary to identify the variable name of offspring observations in var.name
.
When the contact tracing dataset is next-generation cluster size, or final outbreak size observations, the variable names of both observations and seed case size should be identified in var.name
with the format of list(obssize = ?, seedsize = ?)
.
When the contact tracing dataset has more than one types of observations, the variable names of observations, seed case size, and observation type should be identified in var.name
with the format of list(obssize = ?, seedsize = ?, typelab = ?)
.
Value
A list (i.e., list
) contains the following three items:
a data frame (
data.frame
) summaries the median, and 2.5% and 97.5% percentiles (95% credible interval, CrI) of the posterior MCMC samples for each unknown parameters,the maximum log-likelihood value, and
a data frame (
data.frame
) of the posterior MCMC samples and their corresponding log-likelihood values.
Note
Each parameter in can.epi.para.start = list(mean = ?, disp = ?, shift = ?)
should be a scalar, which means vector is not allowed here.
For the contact tracing data in data
, unknown observations (i.e., NA
) is not allowed.
As para.comb.num
is the number of iterations for the MCMC runs, when para.comb.num
is large, e.g., para.comb.num
> 100000, the function paraest.MCMC()
could take few seconds, or even minutes to complete, depending on the sample size, etc.
Thus, we do not recommend the users to change the default setting of para.comb.num
unless for special reasons.
References
Blumberg S, Funk S, Pulliam JR. Detecting differential transmissibilities that affect the size of self-limited outbreaks. PLoS Pathogens. 2014;10(10):e1004452. doi:10.1371/journal.ppat.1004452
Kucharski AJ, Althaus CL. The role of superspreading in Middle East respiratory syndrome coronavirus (MERS-CoV) transmission. Eurosurveillance. 2015;20(25):21167. doi:10.2807/1560-7917.ES2015.20.25.21167
Adam DC, Wu P, Wong JY, Lau EH, Tsang TK, Cauchemez S, Leung GM, Cowling BJ. Clustering and superspreading potential of SARS-CoV-2 infections in Hong Kong. Nature Medicine. 2020;26(11):1714-1719. doi:10.1038/s41591-020-1092-0
Zhao S, Chong MK, Ryu S, Guo Z, He M, Chen B, Musa SS, Wang J, Wu Y, He D, Wang MH. Characterizing superspreading potential of infectious disease: Decomposition of individual transmissibility. PLoS Computational Biology. 2022;18(6):e1010281. doi:10.1371/journal.pcbi.1010281
See Also
Examples
# example 1: for offspring observations #
## reproducing the parameter estimation results in Adam, et al. (2020)
## paper doi link: https://doi.org/10.1038/s41591-020-1092-0,
## (see the first row in Supplementary Table 4),
## where R of 0.58 (95% CI: 0.45, 0.72), and k of 0.43 (95% CI: 0.29, 0.67).
data(COVID19_JanApr2020_HongKong)
set.seed(2023)
MCMC.output = paraest.MCMC(
#can.epi.para.start = list(mean = 0.60, disp = 0.50, shift = 0.20),
offspring.type = "NB", para.comb.num = 10000,
data = COVID19_JanApr2020_HongKong$obs,
obs.type.lab = 'offspring'
)
print(MCMC.output$epi.para.est.output)
## Then, plot the posterior fitting results using 100 randomly-selected MCMC samples.
hist(
COVID19_JanApr2020_HongKong$obs, breaks = c(0:100) -0.5, xlim = c(0,12),
freq = FALSE, xlab = 'secondary cases', ylab = 'rel. freq.', main = ''
)
for(jj in 1:100){
temp.random.col.index = sample.int(n = nrow(MCMC.output$est.record.mat), size = 1)
est.record.array = MCMC.output$est.record.mat[temp.random.col.index,]
lines(0:12, d_offspringdistn(
x = 0:12,
epi.para = list(
mean = est.record.array$epi.para.mean,
disp = est.record.array$epi.para.disp, shift = 0.1
),
offspring.type = "NB"
), col = '#0066FF33', type = 'l', lty = 3)
}
# example 2: for offspring observations #
## reproducing the parameter estimation results in Zhao, et al. (2020)
## paper doi link: https://doi.org/10.1371/journal.pcbi.1010281,
## (see the results of dataset #3 using Delaporte distribution in Table 1), where
## R of 0.59 (95% CI: 0.46, 0.78),
## k of 0.16 (95% CI: 0.06, 0.40), and
## shift of 0.17 (95% CI: 0.04, 0.30).
data(COVID19_JanApr2020_HongKong)
set.seed(2023)
paraest.MCMC(
can.epi.para.start = list(mean = 1, disp = 0.5, shift = 0.2),
offspring.type = "D",
data = COVID19_JanApr2020_HongKong$obs,
obs.type.lab = 'offspring'
)$epi.para.est.output
# example 3: for next-generation cluster size observations #
## reproducing the parameter estimation results in Blumberg, et al, (2014)
## paper doi link: https://doi.org/10.1371/journal.ppat.1004452,
## (see the last row in Table 3, and Fig 4A),
## where R of 3.14 (95% CI: 2, >6), and k of 0.37 (95% CI: not reported).
data(smallpox_19581973_Europe)
set.seed(2023)
paraest.MCMC(
can.epi.para.start = list(mean = 1, disp = 0.5, shift = 0.2),
offspring.type = "NB",
data = smallpox_19581973_Europe,
var.name = list(obssize = 'obs.clustersize', seedsize = 'obs.seed'),
obs.type.lab = 'nextgen'
)$epi.para.est.output
# example 4: final outbreak size observations #
## reproducing the parameter estimation results in Kucharski, Althaus. (2015)
## paper doi link: https://doi.org/10.2807/1560-7917.ES2015.20.25.21167,
## (see Fig 1, and Finding section),
## where R of 0.47 (95% CI: 0.29, 0.80), and k of 0.26 (95% CI: 0.09, 1.24).
data(MERS_2013_MEregion)
set.seed(2023)
paraest.MCMC(
can.epi.para.start = list(mean = 1, disp = 0.5, shift = 0.2),
offspring.type = "NB",
data = MERS_2013_MEregion,
var.name = list(obssize = 'obs.finalsize', seedsize = 'obs.seed'),
obs.type.lab = 'outbreak'
)$epi.para.est.output
# example 5: for more than one types of observations #
## reproducing the parameter estimation results in Blumberg, et al, (2014)
## paper doi link: https://doi.org/10.1371/journal.ppat.1004452,
## (see the last row in Table 5, and Fig 6A),
## where R of 0.3 (95% CI: 0.2, 0.5), and k of 0.4 (95% CI: not reported).
data(mpox_19801984_DRC)
set.seed(2023)
paraest.MCMC(
can.epi.para.start = list(mean = 1, disp = 0.5, shift = 0.2),
offspring.type = "NB", para.comb.num = 30000,
data = mpox_19801984_DRC,
var.name = list(obssize = 'obs.size', seedsize = 'obs.seed', typelab = 'type'),
obs.type.lab = list(offspring = 'offspring', nextgen = 'nextgen', outbreak = 'outbreak')
)$epi.para.est.output