R: The likelihood function

overalllikelihood {modelSSE}

R Documentation

The likelihood function

Description

This function (i.e., overalllikelihood()) calculates the likelihood value from a list of pre-defined epidemiological parameters and a given set of structured contact tracing data.

Usage

overalllikelihood(
  epi.para = list(mean = 1, disp = 0.5, shift = 0.2),
  offspring.type = "D",
  is.log = TRUE,
  data = NULL,
  var.name = list(obssize = NULL, seedsize = NULL, typelab = NULL),
  obs.type.lab = list(offspring = NULL, nextgen = NULL, outbreak = NULL)
)

Arguments

`epi.para`	A list (`list`) of pre-defined epidemiological parameters for the offspring distribution, in the format of `list(mean = ?, disp = ?, shift = ?)`, where the three parameters accept non-negative values. Each parameter must be a scalar. For the Delaporte distribution, the value of `mean` should be larger than the value of `shift`.
`offspring.type`	A character label (`character`) indicating the type of distribution used to describe the offspring distribution. It only accepts one of the following values: `"D"` indicates the Delaporte distribution, `"NB"` indicates the negative binomial distribution, `"G"` indicates the geometric distribution, or `"P"` indicates the Poisson distribution. By default, `offspring.type = 'D'`.
`is.log`	A logical variable. If `is.log = TRUE`, the natural logarithm of the likelihood (i.e., the log-likelihood) is returned; otherwise the likelihood itself is returned. By default, `is.log = TRUE`.
`data`	A data frame (`data.frame`), or a vector (only when `obs.type.lab = "offspring"`) that contains the structured contact tracing data.
`var.name`	A list (`list`) of variable names, or a single character string giving one variable name, identifying the columns of the dataset given in `data`. A list of variable names should be in the format of `list(obssize = ?, seedsize = ?, typelab = ?)`. Please see the details section for more information. By default, `var.name = list(obssize = NULL, seedsize = NULL, typelab = NULL)`.
`obs.type.lab`	A list (`list`) of labels, or a single character label (i.e., "offspring", "nextgen", or "outbreak"), for the type of observations. A list of labels should be in the format of `list(offspring = ?, nextgen = ?, outbreak = ?)`. Please see the details section for more information. By default, `obs.type.lab = list(offspring = NULL, nextgen = NULL, outbreak = NULL)`.

Details

When obs.type.lab is a character, it should be one of "offspring", "nextgen", or "outbreak", indicating the type of observations.

obs.type.lab should be a list when the contact tracing data contain more than one type of observations. See example 4 in the Examples section.

When the contact tracing dataset consists of offspring case observations, the function argument data can be either a vector or a data frame. If data is a vector, it is not necessary to assign any value to var.name. If data is a data frame, it is necessary to identify the variable name of the offspring observations in var.name. See example 1 in the Examples section.
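
As a rough illustration of what the offspring-observation case computes, the sketch below evaluates a negative binomial log-likelihood with base R's dnbinom(). It is not the package's implementation: the toy counts, the helper name nb_offspring_loglik(), and the mapping of mean and disp to the mu and size arguments of dnbinom() are assumptions made for this example. It also shows that a plain vector and the matching data frame column lead to the same value.

```r
# Hypothetical toy data: number of secondary cases per index case.
offspring.counts <- c(0, 0, 1, 0, 3, 0, 2, 0, 0, 7)
toy.df <- data.frame(obs = offspring.counts)

# Assumed parameterisation: mean -> mu, disp -> size in dnbinom().
nb_offspring_loglik <- function(x, mean, disp) {
  sum(dnbinom(x, size = disp, mu = mean, log = TRUE))
}

# The same observations supplied as a vector and as a data frame column.
ll.vector <- nb_offspring_loglik(offspring.counts, mean = 1, disp = 0.5)
ll.column <- nb_offspring_loglik(toy.df[["obs"]], mean = 1, disp = 0.5)
print(c(vector = ll.vector, column = ll.column))
```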

When the contact tracing dataset consists of next-generation cluster size or final outbreak size observations, the variable names of both the observations and the seed case size should be identified in var.name with the format of list(obssize = ?, seedsize = ?). See example 2 and example 3 in the Examples section.
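
For intuition on why the seed case size matters, the following sketch writes out the two cluster-type probabilities for a negative binomial offspring distribution using base R only. It assumes (this is not stated on this page) that the observed size counts the seed cases as well, and that mean and disp map to the mu and size arguments of dnbinom(). The final-size expression is the standard branching-process formula discussed in the references (e.g., Nishiura et al. 2012; Blumberg et al. 2014). The function names and toy clusters are hypothetical.

```r
# Next generation: the offspring of s seeds is a sum of s i.i.d. NB(mean, disp)
# draws, which is NB(mu = s * mean, size = s * disp).
# Assumption: x counts the seeds plus their direct offspring.
nb_nextgen_logprob <- function(x, seed, mean, disp) {
  dnbinom(x - seed, size = seed * disp, mu = seed * mean, log = TRUE)
}

# Final outbreak size y started by s seeds:
# P(Y = y) = (s / y) * P(total offspring of y cases = y - s).
nb_outbreak_logprob <- function(y, seed, mean, disp) {
  log(seed / y) + dnbinom(y - seed, size = y * disp, mu = y * mean, log = TRUE)
}

# Hypothetical toy clusters (each size is at least its seed size).
size <- c(1, 3, 2, 5)
seed <- c(1, 1, 2, 1)
ll.nextgen <- sum(nb_nextgen_logprob(size, seed, mean = 0.6, disp = 0.5))
ll.outbreak <- sum(nb_outbreak_logprob(size, seed, mean = 0.6, disp = 0.5))
print(c(nextgen = ll.nextgen, outbreak = ll.outbreak))
```

With a mean below 1, the final-size probabilities sum to 1 over all sizes, which is a convenient sanity check for a hand-written formula of this kind.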

When the contact tracing dataset has more than one type of observations, the variable names of the observations, seed case size, and observation type should be identified in var.name with the format of list(obssize = ?, seedsize = ?, typelab = ?). See example 4 in the Examples section.
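
One way to picture the mixed-type case is a data frame with one row per observation, where the type label selects the probability model for that row and the overall log-likelihood is the sum over rows (assuming independent observations). The sketch below does this with a Poisson offspring distribution in base R. The toy data, column names, and helper are hypothetical, sizes are assumed to include seed cases for the two cluster types, and the package's internal handling may differ.

```r
# Hypothetical mixed-type contact tracing data.
toy <- data.frame(
  obs.size = c(0, 2, 3, 1, 4),
  obs.seed = c(1, 1, 2, 1, 1),
  type     = c("offspring", "offspring", "nextgen", "outbreak", "outbreak"),
  stringsAsFactors = FALSE
)

# Row-wise log-probability under a Poisson offspring distribution.
pois_row_logprob <- function(size, seed, type, mean) {
  switch(type,
    offspring = dpois(size, lambda = mean, log = TRUE),
    nextgen   = dpois(size - seed, lambda = seed * mean, log = TRUE),
    outbreak  = log(seed / size) +
      dpois(size - seed, lambda = size * mean, log = TRUE),
    stop("unknown observation type: ", type)
  )
}

# Evaluate each row with its own model, then add up the contributions.
row.ll <- mapply(pois_row_logprob, toy$obs.size, toy$obs.seed, toy$type,
                 MoreArgs = list(mean = 0.8))
overall.ll <- sum(row.ll)
print(overall.ll)
```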

Value

The log-likelihood (by default), or likelihood value from contact tracing data, with pre-defined epidemiological parameters.
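
The two return scales are related by exponentiation, and the log scale is usually the safer choice because a product of many small probabilities can underflow to zero. A minimal base R sketch of this point follows, using a Poisson model and simulated counts purely as stand-ins (it does not call the package):

```r
set.seed(1)
x <- rpois(2000, lambda = 1)  # simulated stand-in data

loglik <- sum(dpois(x, lambda = 1, log = TRUE))  # sum of log-probabilities
lik <- prod(dpois(x, lambda = 1))                # product of probabilities

print(loglik)  # finite and negative
print(lik)     # underflows to 0 for a sample this large
```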

Note

Each parameter in epi.para = list(mean = ?, disp = ?, shift = ?) should be a scalar, which means that a vector is not allowed here.
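
The constraints on epi.para stated on this page (scalar, non-negative, and mean larger than shift for the Delaporte distribution) can be checked before calling the function. The helper below is a hypothetical sketch and not part of the package:

```r
# Hypothetical pre-check of the documented constraints on epi.para.
check_epi_para <- function(epi.para, offspring.type = "D") {
  for (nm in c("mean", "disp", "shift")) {
    v <- epi.para[[nm]]
    # Each entry must be one non-missing, non-negative number.
    if (!is.numeric(v) || length(v) != 1 || is.na(v) || v < 0) {
      stop("`", nm, "` must be a single non-negative number")
    }
  }
  # Delaporte only: the mean must be larger than the shift.
  if (offspring.type == "D" && !(epi.para$mean > epi.para$shift)) {
    stop("for the Delaporte distribution, `mean` must exceed `shift`")
  }
  invisible(TRUE)
}

check_epi_para(list(mean = 1, disp = 0.5, shift = 0.2))              # passes
try(check_epi_para(list(mean = c(1, 2), disp = 0.5, shift = 0.2)))   # error: vector
```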

For the contact tracing data in data, unknown observations (i.e., NA) are not allowed.
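
Because missing values are not accepted, it may help to screen the data first. A minimal base R sketch with a hypothetical toy data frame is given below. Whether dropping incomplete rows is appropriate depends on the study, so this only shows the mechanics:

```r
# Hypothetical toy data with one missing observation.
toy <- data.frame(obs.size = c(1, NA, 3), obs.seed = c(1, 1, 2))

anyNA(toy)                            # TRUE: at least one missing value
toy.complete <- toy[complete.cases(toy), , drop = FALSE]
anyNA(toy.complete)                   # FALSE after dropping incomplete rows
```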

References

Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. Superspreading and the effect of individual variation on disease emergence. Nature. 2005;438(7066):355-359. doi:10.1038/nature04153

Nishiura H, Yan P, Sleeman CK, Mode CJ. Estimating the transmission potential of supercritical processes based on the final size distribution of minor outbreaks. Journal of Theoretical Biology. 2012;294:48-55. doi:10.1016/j.jtbi.2011.10.039

Blumberg S, Funk S, Pulliam JR. Detecting differential transmissibilities that affect the size of self-limited outbreaks. PLoS Pathogens. 2014;10(10):e1004452. doi:10.1371/journal.ppat.1004452

Kucharski AJ, Althaus CL. The role of superspreading in Middle East respiratory syndrome coronavirus (MERS-CoV) transmission. Eurosurveillance. 2015;20(25):21167. doi:10.2807/1560-7917.ES2015.20.25.21167

Endo A, Abbott S, Kucharski AJ, Funk S. Estimating the overdispersion in COVID-19 transmission using outbreak sizes outside China. Wellcome Open Research. 2020;5:67. doi:10.12688/wellcomeopenres.15842.3

Adam DC, Wu P, Wong JY, Lau EH, Tsang TK, Cauchemez S, Leung GM, Cowling BJ. Clustering and superspreading potential of SARS-CoV-2 infections in Hong Kong. Nature Medicine. 2020;26(11):1714-1719. doi:10.1038/s41591-020-1092-0

Zhao S, Chong MK, Ryu S, Guo Z, He M, Chen B, Musa SS, Wang J, Wu Y, He D, Wang MH. Characterizing superspreading potential of infectious disease: Decomposition of individual transmissibility. PLoS Computational Biology. 2022;18(6):e1010281. doi:10.1371/journal.pcbi.1010281

Examples


# example 1 #
## likelihood for the offspring observations
data(COVID19_JanApr2020_HongKong)
overalllikelihood(
  epi.para = list(mean = 1, disp = 0.5, shift = 0.2),
  offspring.type = "D",
  data = COVID19_JanApr2020_HongKong,
  var.name = list(obssize = 'obs'),
  obs.type.lab = 'offspring'
)
overalllikelihood(
  epi.para = list(mean = 1, disp = 0.5, shift = 0.2),
  offspring.type = "D",
  data = COVID19_JanApr2020_HongKong$obs,
  obs.type.lab = 'offspring'
)


# example 2 #
## likelihood for the next-generation cluster size observations
data(smallpox_19581973_Europe)
overalllikelihood(
  epi.para = list(mean = 1, disp = 0.5, shift = 0.2),
  offspring.type = 'D',
  data = smallpox_19581973_Europe,
  var.name = list(obssize = 'obs.clustersize', seedsize = 'obs.seed'),
  obs.type.lab = 'nextgen'
)


# example 3 #
## likelihood for the final outbreak size observations
data(MERS_2013_MEregion)
overalllikelihood(
  epi.para = list(mean = 1, disp = 0.5, shift = 0.2),
  offspring.type = 'D',
  data = MERS_2013_MEregion,
  var.name = list(obssize = 'obs.finalsize', seedsize = 'obs.seed'),
  obs.type.lab = 'outbreak'
)


# example 4 #
## likelihood for more than one type of observations
data(mpox_19801984_DRC)
overalllikelihood(
  epi.para = list(mean = 1, disp = 0.5, shift = 0.2),
  offspring.type = 'D',
  data = mpox_19801984_DRC,
  var.name = list(obssize = 'obs.size', seedsize = 'obs.seed', typelab = 'type'),
  obs.type.lab = list(offspring = 'offspring', nextgen = 'nextgen', outbreak = 'outbreak')
)


# example 5 #
## reproducing the AIC results in Adam, et al. (2020) https://doi.org/10.1038/s41591-020-1092-0,
## (see Supplementary Table 4),
## where the AIC scores were calculated for NB, Geometric, and Poisson models from top to bottom.
## Here, the AIC is defined as: AIC = -2 * log-likelihood + 2 * number of unknown model parameters.
data(COVID19_JanApr2020_HongKong)
overalllikelihood(
  epi.para = list(mean = 0.58, disp = 0.43, shift = 0.2),
  offspring.type = "NB",
  data = COVID19_JanApr2020_HongKong$obs,
  obs.type.lab = 'offspring'
) * (-2) + 2*2
overalllikelihood(
  epi.para = list(mean = 0.63, disp = 0.43, shift = 0.2),
  offspring.type = "G",
  data = COVID19_JanApr2020_HongKong$obs,
  obs.type.lab = 'offspring'
) * (-2) + 1*2
overalllikelihood(
  epi.para = list(mean = 0.58, disp = 0.43, shift = 0.2),
  offspring.type = "P",
  data = COVID19_JanApr2020_HongKong$obs,
  obs.type.lab = 'offspring'
) * (-2) + 1*2

[Package modelSSE version 0.1-3 Index]