fit_incidence {incidental} | R Documentation |
Fit incidence curve to reported data
Description
Fits an incidence curve to a series of reported cases and a delay distribution using an empirical Bayes estimation method, which fits the parameters of a spline basis. All hyperparameter tuning and data processing are done within this function.
Usage
fit_incidence(
reported,
delay_dist,
dof_grid = seq(6, 20, 2),
dof_method = "aic",
lam_grid = 10^(seq(-1, -8, length.out = 20)),
lam_method = "val",
percent_thresh = 2,
regularization_order = 2,
num_ar_steps = 10,
num_ar_samps = 100,
linear_tail = 14,
front_pad_size = 10,
extrapolation_prior_precision = 10,
frac_train = 0.75,
fisher_approx_cov = TRUE,
end_pad_size = 50,
num_samps_per_ar = 10,
val_restarts = 2,
seed = 1
)
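As a quick illustration of the two default tuning grids in the usage above, the following sketch evaluates them in base R. It only shows what the default expressions expand to; how the function scans them internally is not shown here.

```r
# Default grid of spline degrees of freedom: 6, 8, ..., 20.
dof_grid <- seq(6, 20, 2)

# Default grid of regularization strengths: 20 log-spaced values,
# ordered from 1e-1 (strongest) down to 1e-8 (weakest).
lam_grid <- 10^(seq(-1, -8, length.out = 20))

print(dof_grid)
print(length(lam_grid))
print(range(lam_grid))
```

A coarser grid (for example fewer values in lam_grid) should reduce the number of candidate fits, at the cost of a less finely tuned result.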
Arguments
reported |
An integer vector of reported cases. |
delay_dist |
A positive vector that sums to one, which describes the delay distribution. |
dof_grid |
An integer vector of degrees of freedom for the spline basis. |
dof_method |
Metric to choose "best" spline degrees of freedom: 'aic': Akaike information criterion, 'bic': Bayesian information criterion, 'val': validation likelihood. |
lam_grid |
A vector of regularization strengths to scan. |
lam_method |
Metric to choose "best" regularization strength lambda: 'aic': Akaike information criterion, 'bic': Bayesian information criterion, 'val': validation likelihood. |
percent_thresh |
If using validation likelihood to select the best lambda, the largest (strongest) lambda whose validation likelihood is within 'percent_thresh' percent of the highest validation likelihood will be selected. Default is 2. Must be greater than 0. |
regularization_order |
An integer (typically 0, 1, 2), indicating differencing order for L2 regularization of spline parameters. Default is 2 for second derivative penalty. |
num_ar_steps |
An integer number of autoregressive (AR) steps after the last observation. |
num_ar_samps |
An integer number of AR samples. |
linear_tail |
An integer number of days used to fit linear model on tail to be used as a mean for AR extrapolation. |
front_pad_size |
An integer for initial number of 0's before first observation. |
extrapolation_prior_precision |
A positive scalar for extrapolation slope shrinkage prior precision. |
frac_train |
A numeric between 0 and 1 for fraction of data used to train lambda validation. |
fisher_approx_cov |
A flag to use either the Fisher Information (TRUE) or the Hessian (FALSE) to approximate the posterior covariance over parameters. |
end_pad_size |
An integer number of steps the spline is defined beyond the final observation. |
num_samps_per_ar |
An integer for the number of Laplace samples per AR fit. |
val_restarts |
An integer for the number of times to refit hyperparameters if 'val' is used for either dof_method or lam_method. Set to 1 for faster but more unstable fits. |
seed |
Seed for RNG. |
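To make the regularization_order argument more concrete, here is a minimal base R sketch of an L2 penalty on differenced spline coefficients. The helper difference_penalty is hypothetical and written only for illustration; it is not part of the package, and the package's internal penalty may be scaled or constructed differently.

```r
# Hypothetical helper (not from the package): L2 penalty on the
# order-th differences of a coefficient vector beta.
difference_penalty <- function(beta, lambda, order = 2) {
  k <- length(beta)
  # Order 0 penalizes the coefficients themselves (identity matrix);
  # higher orders penalize their order-th finite differences.
  D <- if (order == 0) diag(k) else diff(diag(k), differences = order)
  lambda * sum((D %*% beta)^2)
}

# Coefficients that lie exactly on a straight line.
beta_linear <- c(1, 2, 3, 4, 5)

# Second differences of a linear sequence are all zero.
print(difference_penalty(beta_linear, lambda = 0.1, order = 2))
# First differences are all 1, so the penalty is nonzero.
print(difference_penalty(beta_linear, lambda = 0.1, order = 1))
# Order 0 penalizes the magnitude of the coefficients directly.
print(difference_penalty(beta_linear, lambda = 0.1, order = 0))
```

Under this sketch, order 2 leaves linear trends in the coefficients unpenalized, order 1 leaves only constants unpenalized, and order 0 shrinks every coefficient toward zero.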
Value
A list with the following entries:
Isamps – sample of the incidence curve from a Laplace approximation per AR sample;
Ihat – MAP incidence curve estimate;
Chat – expected cases given MAP incidence curve estimate;
beta_hats – matrix of beta estimates per AR sample;
best_dof – best degrees of freedom from tuning;
best_lambda – best regularization parameter from tuning; and
reported – a copy of reported values used for fitting.
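The relationship between an incidence curve and expected cases can be sketched as a discrete convolution with the delay distribution. The helper expected_cases below is hypothetical, and it assumes that the first element of delay_dist is the probability of a zero-day delay; the package's own indexing convention and internal computation may differ.

```r
# Hypothetical helper (not from the package): expected reported cases
# as the convolution of incidence with a delay distribution.
expected_cases <- function(incidence, delay_dist) {
  n <- length(incidence)
  out <- numeric(n)
  for (t in seq_len(n)) {
    for (d in seq_along(delay_dist)) {
      # Assumption: delay_dist[d] is the probability of a delay of d - 1 days.
      s <- t - (d - 1)
      if (s >= 1) {
        out[t] <- out[t] + incidence[s] * delay_dist[d]
      }
    }
  }
  out
}

# Toy input: 10 infections on day 1, none afterwards.
incidence <- c(10, 0, 0, 0, 0)
delay <- c(0.5, 0.3, 0.2)

# The 10 infections are spread over days 1 to 3 according to the delay.
print(expected_cases(incidence, delay))
```

Fitting the incidence curve is the inverse problem: recovering a smooth incidence curve whose expected cases are consistent with the reported series.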
Examples
indiana_model <- fit_incidence(
reported = spanish_flu$Indiana,
delay_dist = spanish_flu_delay_dist$proportion)
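A slightly longer variant of the example above might check the delay distribution before fitting and then inspect the tuned hyperparameters. The helper is_valid_delay_dist is hypothetical and simply encodes the documented requirement (a positive vector that sums to one); the fit itself runs only if the incidental package is installed.

```r
# Hypothetical helper (not from the package): checks the documented
# requirement that delay_dist is a positive vector summing to one.
is_valid_delay_dist <- function(x, tol = 1e-8) {
  is.numeric(x) && length(x) > 0 && all(x > 0) && abs(sum(x) - 1) < tol
}

print(is_valid_delay_dist(c(0.5, 0.3, 0.2)))  # sums to one
print(is_valid_delay_dist(c(0.5, 0.6)))       # sums to 1.1

# Run the fit only when the package is available.
if (requireNamespace("incidental", quietly = TRUE)) {
  library(incidental)
  print(is_valid_delay_dist(spanish_flu_delay_dist$proportion))
  indiana_model <- fit_incidence(
    reported = spanish_flu$Indiana,
    delay_dist = spanish_flu_delay_dist$proportion)
  # Entries listed in the Value section.
  print(names(indiana_model))
  print(indiana_model$best_dof)
  print(indiana_model$best_lambda)
}
```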