R: Fit incidence curve to reported data

fit_incidence {incidental}

R Documentation

Fit incidence curve to reported data

Description

This is a function that fits an incidence curve to a set of reported cases and delay distribution using an empirical Bayes estimation method, which fits parameters for a spline basis. All hyper parameter tuning and data processing are done within this function.

Usage

fit_incidence(
  reported,
  delay_dist,
  dof_grid = seq(6, 20, 2),
  dof_method = "aic",
  lam_grid = 10^(seq(-1, -8, length.out = 20)),
  lam_method = "val",
  percent_thresh = 2,
  regularization_order = 2,
  num_ar_steps = 10,
  num_ar_samps = 100,
  linear_tail = 14,
  front_pad_size = 10,
  extrapolation_prior_precision = 10,
  frac_train = 0.75,
  fisher_approx_cov = TRUE,
  end_pad_size = 50,
  num_samps_per_ar = 10,
  val_restarts = 2,
  seed = 1
)

Arguments

`reported`	An integer vector of reported cases.
`delay_dist`	A positive vector that sums to one, which describes the delay distribution.
`dof_grid`	An integer vector of degrees of freedom for the spline basis.
`dof_method`	Metric to choose "best" spline degrees of freedom: 'aic': Akaike information criterion, 'bic': Bayesian information criterion, 'val': validation likelihood.
`lam_grid`	A vector of regularization strengths to scan.
`lam_method`	metric to choose "best" regularization strength lambda: 'aic': Akaike information criterion, 'bic': Bayesian information criterion, 'val': validation likelihood.
`percent_thresh`	If using validation likelihood to select best, the largest (strongest) lambda that is within 'percent_thresh' of the highest validation lambda will be selected. Default is 2. Must be greater than 0.
`regularization_order`	An integer (typically 0, 1, 2), indicating differencing order for L2 regularization of spline parameters. Default is 2 for second derivative penalty.
`num_ar_steps`	An integer number of AR steps after last observation.
`num_ar_samps`	An integer number of AR samples.
`linear_tail`	An integer number of days used to fit linear model on tail to be used as a mean for AR extrapolation.
`front_pad_size`	An integer for initial number of 0's before first observation.
`extrapolation_prior_precision`	A positive scalar for extrapolation slope shrinkage prior precision.
`frac_train`	A numeric between 0 and 1 for fraction of data used to train lambda validation.
`fisher_approx_cov`	A flag to use either the Fisher Information (TRUE) or the Hessian (FALSE) to approx posterior covariance over parameters.
`end_pad_size`	And integer number of steps the spline is defined beyond the final observation.
`num_samps_per_ar`	An integer for the number of Laplace samples per AR fit.
`val_restarts`	An integer for the number of times to refit hyperparameters if 'val' is used for either. Set to 1 for faster but more unstable fits.
`seed`	Seed for RNG.

Value

A list with the following entries:

Isamps – sample of the incidence curve from a Laplace approximation per AR sample;
Ihat – MAP incidence curve estimate;
Chat – expected cases given MAP incidence curve estimate;
beta_hats – matrix of beta's per AR sample;
best_dof – best degrees of freedom from tuning;
best_lambda – best regularization parameter from tuning; and
reported – a copy of reported values used for fitting.

Examples


indiana_model <- fit_incidence(
                  reported = spanish_flu$Indiana, 
                  delay_dist = spanish_flu_delay_dist$proportion)

[Package incidental version 0.1 Index]