estimR {EpiLPS} | R Documentation |
Estimation of the reproduction number with Laplacian-P-splines
Description
This routine estimates the instantaneous reproduction number R_t
; the mean number
of secondary infections generated by an infected individual at time t
(White et
al. 2020); by using Bayesian P-splines and Laplace approximations (Gressani et al. 2022).
Estimation of R_t
is based on a time series of incidence counts and (a discretized) serial
interval distribution. The negative binomial distribution is used to model incidence
count data and P-splines (Eilers and Marx, 1996) are used to smooth the epidemic curve. The link
between the epidemic curve and the reproduction number is established via the renewal equation.
Usage
estimR(incidence, si, K = 30, dates = NULL, maxmethod = c("NelderMead","HillClimb"),
CoriR = FALSE, WTR = FALSE, optimstep = 0.3, priors = Rmodelpriors())
Arguments
incidence |
A vector containing the incidence time series. If |
si |
The (discrete) serial interval distribution. |
K |
Number of B-splines in the basis. |
dates |
A vector of dates in format "YYYY-MM-DD" (optional). |
maxmethod |
The method to maximize the hyperparameter posterior distribution. |
CoriR |
Should the |
WTR |
Should the |
optimstep |
Learning rate for the "HillClimb" method to maximize the posterior distribution of the hyperparameters. |
priors |
A list containing the prior specification of the model hyperparameters as set in Rmodelpriors. See ?Rmodelpriors. |
Details
The estimR
routine estimates the reproduction number in a
totally "sampling-free" fashion. The hyperparameter vector (containing the
penalty parameter of the P-spline model and the overdispersion parameter of
the negative binomial model for the incidence time series) is fixed at its
maximum a posteriori (MAP). By default, the algorithm for maximization is
the one of Nelder and Mead (1965). If maxmethod
is set to "HillClimb",
then a gradient ascent algorithm is used to maximize the hyperparameter posterior.
Value
A list with the following components:
incidence: The incidence time series.
si: The serial interval distribution.
RLPS: A data frame containing estimates of the reproduction number obtained with the Laplacian-P-splines methodology.
thetahat: The estimated vector of B-spline coefficients.
Sighat: The estimated variance-covariance matrix of the Laplace approximation to the conditional posterior distribution of the B-spline coefficients.
RCori: A data frame containing the estimates of the reproduction obtained with the method of Cori et al. (2013).
RWT: A data frame containing the estimates of the reproduction obtained with the method of Wallinga-Teunis (2004).
LPS_elapsed: The routine real elapsed time (in seconds) when estimation of the reproduction number is carried out with Laplacian-P-splines.
Cori_elapsed: The routine real elapsed time (in seconds) when estimation of the reproduction number is carried out with the method of Cori et al. (2013).
penparam: The estimated penalty parameter related to the P-spline model.
K: The number of B-splines used in the basis.
NegBinoverdisp: The estimated overdispersion parameter of the negative binomial distribution for the incidence time series.
optimconverged: Indicates whether the algorithm to maximize the posterior distribution of the hyperparameters has converged.
method: The method to estimate the reproduction number with Laplacian-P-splines.
optim_method: The chosen method to to maximize the posterior distribution of the hyperparameters.
Author(s)
Oswaldo Gressani oswaldo_gressani@hotmail.fr
References
Gressani, O., Wallinga, J., Althaus, C. L., Hens, N. and Faes, C. (2022). EpiLPS: A fast and flexible Bayesian tool for estimation of the time-varying reproduction number. Plos Computational Biology, 18(10): e1010618.
Cori, A., Ferguson, N.M., Fraser, C., Cauchemez, S. (2013). A new framework and software to estimate time-varying reproduction numbers during epidemics. American Journal of Epidemiology, 178(9):1505–1512.
Wallinga, J., & Teunis, P. (2004). Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. American Journal of Epidemiology, 160(6), 509-516.
White, L.F., Moser, C.B., Thompson, R.N., Pagano, M. (2021). Statistical estimation of the reproductive number from case notification data. American Journal of Epidemiology, 190(4):611-620.
Eilers, P.H.C. and Marx, B.D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11(2):89-121.
Examples
# Illustration on simulated data
si <- Idist(mean = 5, sd = 3)$pvec
datasim <- episim(si = si, endepi = 60, Rpattern = 5, dist="negbin", overdisp = 50)
epifit_sim <- estimR(incidence = datasim$y, si = si, CoriR = TRUE)
plot(epifit_sim, addfit = "Cori")
# Illustration on the 2003 SARS epidemic in Hong Kong.
data(sars2003)
epifit_sars <- estimR(incidence = sars2003$incidence, si = sars2003$si, K = 40)
tail(epifit_sars$RLPS)
summary(epifit_sars)
plot(epifit_sars)