asir {msSPChelpR} | R Documentation |
Calculate age-standardized incidence rates
Description
Calculate age-standardized incidence rates
Usage
asir(
df,
dattype = NULL,
std_pop = "ESP2013",
truncate_std_pop = FALSE,
futime_src = "refpop",
summarize_groups = "none",
count_var,
stdpop_df = standard_population,
refpop_df = population,
region_var = NULL,
age_var = NULL,
sex_var = NULL,
year_var = NULL,
site_var = NULL,
futime_var = NULL,
pyar_var = NULL,
alpha = 0.05
)
Arguments
df |
dataframe in wide format |
dattype |
can be "zfkd" or "seer" or NULL. Will set default variable names if dattype is "seer" or "zfkd". Default is NULL. |
std_pop |
can be one of "ESP2013", "ESP1976", "WHO1960" or "WHO2000". Default is "ESP2013". |
truncate_std_pop |
if TRUE, age-groups that do not occur in df are removed from the standard population. Default is FALSE. |
futime_src |
can be either "refpop" or "cohort". Default is "refpop". |
summarize_groups |
option to summarize results across stratification groups. Default is "none". To collapse the strata of one or more variables into a single group, choose from region_var, sex_var and year_var. Pass multiple variables as a vector, e.g. summarize_groups = c("region", "sex", "year"). |
count_var |
variable in df that flags observed cases. It should be 1 for each case to be counted. |
stdpop_df |
df where standard population is defined. It is assumed that stdpop_df has the columns "sex" for biological sex, "age" for age-groups, "standard_pop" for name of standard population (e.g. "European Standard Population 2013") and "population_n" for size of standard population age-group. stdpop_df must use the same category coding of age and sex as age_var and sex_var. |
refpop_df |
df where reference population data is defined. Only required if futime_src = "refpop" is chosen. It is assumed that refpop_df has the columns "region" for region, "sex" for biological sex, "age" for age-groups (can be single ages or 5-year brackets), "year" for time period (can be single year or 5-year brackets), "population_pyar" for person-years at risk in the respective age/sex/year cohort. refpop_df must use the same category coding of age, sex, region, year and site as age_var, sex_var, region_var, year_var and site_var. |
region_var |
variable in df that contains information on region where case was incident. Default is set if dattype is given. |
age_var |
variable in df that contains information on age-group. Default is set if dattype is given. |
sex_var |
variable in df that contains information on biological sex. Default is set if dattype is given. |
year_var |
variable in df that contains information on year or year-period when case was incident. Default is set if dattype is given. |
site_var |
variable in df that contains information on ICD code of case diagnosis. Default is set if dattype is given. |
futime_var |
variable in df that contains follow-up time per person (in years) in cohort (can only be used with futime_src = "cohort"). Default is set if dattype is given. |
pyar_var |
variable in refpop_df that contains person-years-at-risk in reference population (can only be used with futime_src = "refpop"). Default is set if dattype is given. |
alpha |
significance level for confidence interval calculations. Default is alpha = 0.05 which will give 95 percent confidence intervals. |
Value
df
Examples
#load sample data
data("us_second_cancer")
data("standard_population")
data("population_us")
#make wide data as this is the required format
usdata_wide <- us_second_cancer %>%
  #only use sample
  dplyr::filter(as.numeric(fake_id) < 200000) %>%
  msSPChelpR::reshape_wide_tidyr(case_id_var = "fake_id",
                                 time_id_var = "SEQ_NUM", timevar_max = 2)
#create count variable (1 if a second tumor is recorded, 0 otherwise)
usdata_wide <- usdata_wide %>%
  dplyr::mutate(count_spc = dplyr::case_when(!is.na(t_site_icd.2) ~ 1,
                                             TRUE ~ 0))
#remove cases for which no reference population exists
usdata_wide <- usdata_wide %>%
  dplyr::filter(t_yeardiag.2 %in% c("1990 - 1994", "1995 - 1999", "2000 - 2004",
                                    "2005 - 2009", "2010 - 2014"))
#now we can run the function
msSPChelpR::asir(usdata_wide,
                 dattype = "seer",
                 std_pop = "ESP2013",
                 truncate_std_pop = FALSE,
                 futime_src = "refpop",
                 summarize_groups = "none",
                 count_var = "count_spc",
                 refpop_df = population_us,
                 region_var = "registry.1",
                 age_var = "fc_agegroup.1",
                 sex_var = "sex.1",
                 year_var = "t_yeardiag.2",
                 site_var = "t_site_icd.2",
                 pyar_var = "population_pyar")
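#illustrative sketch, not part of the package: the general idea of direct
#age standardization, shown with invented numbers in base R.
#asir() may differ in details (grouping, truncation, confidence intervals).
#all names below (toy, crude_rate, asir_toy) and all values are hypothetical.
toy <- data.frame(
  age          = c("0 - 49", "50 - 69", "70+"),
  cases        = c(10, 40, 50),          #observed cases per age-group (invented)
  pyar         = c(50000, 20000, 10000), #person-years at risk (invented)
  standard_pop = c(60000, 30000, 10000)  #standard population sizes (invented)
)
#age-specific incidence rates per 100,000 person-years
toy$rate <- toy$cases / toy$pyar * 100000
#crude rate: ignores the age structure of the population
crude_rate <- sum(toy$cases) / sum(toy$pyar) * 100000
#directly standardized rate: age-specific rates weighted by the standard population
asir_toy <- sum(toy$rate * toy$standard_pop) / sum(toy$standard_pop)
#compare both rates; they differ because the toy population is younger than the standard
c(crude = crude_rate, standardized = asir_toy)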