excess_model {excessmort} | R Documentation |
Fit excess count model
Description
This function estimates the increase in the rate of a count time series relative to the rate for a typical year. Two options are available: (1) model the rate increase as a smooth function and estimate that function, or (2) estimate the total excess in specified intervals. For option 1, an 'event' date can be provided and a discontinuity included in the model. Either option, or both, can be used.
Usage
excess_model(
counts,
start = NULL,
end = NULL,
knots.per.year = 12,
event = NULL,
intervals = NULL,
discontinuity = TRUE,
model = c("quasipoisson", "poisson", "correlated"),
exclude = NULL,
include.trend = TRUE,
trend.knots.per.year = 1/7,
harmonics = 2,
frequency = NULL,
weekday.effect = FALSE,
control.dates = NULL,
max.control = 5000,
order.max = 14,
aic = TRUE,
maxit = 25,
epsilon = 1e-08,
alpha = 0.05,
min.rate = 1e-04,
keep.counts = FALSE,
keep.components = TRUE,
verbose = TRUE
)
Arguments
counts |
A data frame with date, count and population columns. |
start |
First day of interval to which model will be fit |
end |
Last day of interval to which model will be fit |
knots.per.year |
Number of knots per year used for the fitted smooth function |
event |
If modeling a discontinuity is desired, this is the day on which it occurs |
intervals |
Instead of 'start' and 'end' a list of time intervals can be provided and excess is computed in each one |
discontinuity |
Logical that determines if discontinuity is allowed at 'event' |
model |
Which version of the model to fit |
exclude |
Dates to exclude when computing expected counts |
include.trend |
Logical that determines if a slow trend is included in the model. |
trend.knots.per.year |
Number of knots per year used by 'compute_expected' to estimate the trend for the expected counts |
harmonics |
Number of harmonics used by 'compute_expected' to estimate seasonal trend |
frequency |
Number of observations per year. If not provided an attempt is made to calculate it |
weekday.effect |
Logical that determines if a day-of-the-week effect is included in the model. Should be 'FALSE' for weekly or monthly data. |
control.dates |
When 'model' is set to 'correlated', these dates are used to estimate the covariance matrix. The larger this is the slower the function runs. |
max.control |
If the length of 'control.dates' is larger than 'max.control' the function stops. |
order.max |
Largest order for the autoregressive process used to model the covariance structure |
aic |
A logical that determines if the AIC criterion is used to select the order of the AR process |
maxit |
Maximum number of iterations for the IRLS algorithm used when 'model' is 'correlated' |
epsilon |
Difference in deviance required to declare convergence of IRLS |
alpha |
Percentile used to define what is outside the normal range |
min.rate |
The estimated expected rate is not permitted to go below this value |
keep.counts |
A logical that if 'TRUE' forces the function to include the data used to fit the expected count model. |
keep.components |
A logical that if 'TRUE' forces the function to return the estimated trend, seasonal model, and weekday effect, if included in the model. Ignored if expected counts already provided or 'keep.counts' is 'FALSE'. |
verbose |
Logical that determines if messages are displayed |
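The 'frequency' argument is the number of observations per year, and the function attempts to calculate it when it is not supplied. This page does not say how that is done. As a rough illustration only, one plausible heuristic is to divide the number of days in a year by the typical spacing between dates. The helper 'guess_frequency' below is hypothetical and is not part of excessmort.

```r
# Hypothetical helper, not part of excessmort: guess the number of
# observations per year from the median spacing (in days) between dates.
guess_frequency <- function(dates) {
  spacing <- median(as.numeric(diff(sort(dates))))
  round(365.25 / spacing)
}

daily  <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day")
weekly <- seq(as.Date("2019-01-05"), as.Date("2019-12-28"), by = "week")

guess_frequency(daily)   # 365
guess_frequency(weekly)  # 52
```

A check like this can also help decide whether 'weekday.effect' makes sense, since it should be 'FALSE' for weekly or monthly data.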
Details
Three versions of the model are available: (1) assume counts are Poisson distributed ('model = "poisson"'), (2) assume counts are overdispersed Poisson ('model = "quasipoisson"'), or (3) assume a mixed model with correlated errors ('model = "correlated"'). The second is the default and is recommended for weekly count data. For daily counts we often find evidence of correlation and recommend the third, along with setting 'weekday.effect = TRUE'.
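The recommendation above can be written down as a small lookup. This is only a sketch: 'recommended_settings' is a hypothetical helper, not part of excessmort, and it encodes nothing beyond the two cases named in the text.

```r
# Hypothetical helper, not part of excessmort: map the sampling interval of
# the data to the settings recommended in the text above.
recommended_settings <- function(daily) {
  if (daily) {
    list(model = "correlated", weekday.effect = TRUE)
  } else {
    list(model = "quasipoisson", weekday.effect = FALSE)
  }
}

weekly_args <- recommended_settings(daily = FALSE)
daily_args  <- recommended_settings(daily = TRUE)

# The settings could then be spliced into a call, for example:
# do.call(excess_model, c(list(counts = counts, start = start, end = end), weekly_args))
# Note that the "correlated" model also uses 'control.dates' to estimate
# the covariance matrix.
```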
If the 'counts' object includes an 'expected' column produced by 'compute_expected', these values are used as the expected counts. If not, the expected counts are computed.
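The two-step workflow this implies might look like the following sketch, which reuses the Massachusetts data from the Examples section. It assumes that 'compute_expected' accepts the same 'exclude' argument as 'excess_model', and the block only runs if the excessmort package is installed.

```r
# Sketch only: runs when the excessmort package is available.
if (requireNamespace("excessmort", quietly = TRUE)) {
  library(excessmort)
  data(cdc_state_counts)
  counts <- cdc_state_counts[cdc_state_counts$state == "Massachusetts", ]
  exclude_dates <- c(seq(as.Date("2017-12-16"), as.Date("2018-01-16"), by = "day"),
                     seq(as.Date("2020-01-01"), max(cdc_state_counts$date), by = "day"))

  # Step 1: compute expected counts once, leaving out the excluded dates.
  expected <- compute_expected(counts, exclude = exclude_dates)

  # Step 2: fit the excess model to the object that already carries the
  # 'expected' column, so the expected counts are not recomputed.
  f <- excess_model(expected,
                    start = min(counts$date),
                    end = max(counts$date),
                    knots.per.year = 12)
}
```

Computing the expected counts separately can be convenient when several excess models are fit to the same baseline.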
Value
If only 'intervals' are provided, a data frame with the excess estimates described below for 'excess' is returned. If 'start' and 'end' are provided, a list with the following components is returned:
- date
The dates for which the estimate was computed
- observed
The observed counts
- expected
The expected counts
- fitted
The fitted curve for excess counts
- se
The point-wise standard error for the fitted curve
- population
The population size
- sd
The standard deviation for observed counts in a typical year
- cov
The estimated covariance matrix for the observed counts
- x
The design matrix used for the fit
- betacov
The covariance matrix for the estimated coefficients
- dispersion
The estimated overdispersion parameter
- detected_intervals
Time intervals for which the 1 - 'alpha' confidence interval does not include 0
- ar
The estimated coefficients for the autoregressive process
- excess
A data frame with information for the time intervals provided in 'intervals'. This includes start, end, observed death rate (per 1,000 per year), expected death rate, standard deviation for the death rate, observed counts, expected counts, excess counts, standard deviation
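The 'fitted', 'se' and 'alpha' values relate to one another through point-wise confidence intervals. The sketch below shows the idea with made-up numbers in place of a real fit. It assumes a two-sided normal approximation, which this page does not confirm, so it should not be read as the rule the function uses to build 'detected_intervals'.

```r
alpha  <- 0.05
fitted <- c(0.01, 0.08, 0.20, 0.15, 0.02)  # made-up fitted values
se     <- c(0.03, 0.03, 0.04, 0.04, 0.03)  # made-up point-wise standard errors

# Two-sided normal quantile for a 1 - alpha confidence interval.
z     <- qnorm(1 - alpha / 2)
lower <- fitted - z * se
upper <- fitted + z * se

# Flag the time points whose interval does not include 0.
excludes_zero <- lower > 0 | upper < 0
excludes_zero
```

With these numbers the three middle points are flagged, and consecutive flagged points would form one interval.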
Examples
# Fit the smooth excess model to weekly counts for one state
data(cdc_state_counts)
counts <- cdc_state_counts[cdc_state_counts$state == "Massachusetts", ]
# Dates left out when estimating expected counts
exclude_dates <- c(seq(as.Date("2017-12-16"), as.Date("2018-01-16"), by = "day"),
                   seq(as.Date("2020-01-01"), max(cdc_state_counts$date), by = "day"))
f <- excess_model(counts,
                  exclude = exclude_dates,
                  start = min(counts$date),
                  end = max(counts$date),
                  knots.per.year = 12)

# Fit the correlated-errors model with a weekday effect
data(new_jersey_counts)
# Exclude the 181 days starting on 2012-10-29
exclude_dates <- as.Date("2012-10-29") + 0:180
# Use all dates before the excluded period to estimate the covariance matrix
control_dates <- seq(min(new_jersey_counts$date), min(exclude_dates) - 1, by = "day")
f <- excess_model(new_jersey_counts,
                  start = as.Date("2012-09-01"),
                  end = as.Date("2013-09-01"),
                  exclude = exclude_dates,
                  model = "correlated",
                  weekday.effect = TRUE,
                  control.dates = control_dates)