survtab_ag {popEpi} | R Documentation |
Estimate Survival Time Functions
Description
This function estimates survival time functions: survival, relative/net survival, and crude/absolute risk functions (CIF).
Usage
survtab_ag(
  formula = NULL,
  data,
  adjust = NULL,
  weights = NULL,
  surv.breaks = NULL,
  n = "at.risk",
  d = "from0to1",
  n.cens = "from0to0",
  pyrs = "pyrs",
  d.exp = "d.exp",
  n.pp = NULL,
  d.pp = "d.pp",
  d.pp.2 = "d.pp.2",
  n.cens.pp = "n.cens.pp",
  pyrs.pp = "pyrs.pp",
  d.exp.pp = "d.exp.pp",
  surv.type = "surv.rel",
  surv.method = "hazard",
  relsurv.method = "e2",
  subset = NULL,
  conf.level = 0.95,
  conf.type = "log-log",
  verbose = FALSE
)
Arguments
formula |
a |
data |
since popEpi 0.4.0, a |
adjust |
can be used as an alternative to passing variables to
argument |
weights |
typically a list of weights or a |
surv.breaks |
a vector of breaks on the
survival time scale. Optional if |
n |
variable containing counts of subjects at-risk at the start of a
time interval; e.g. |
d |
variable(s) containing counts of subjects experiencing an event.
With only one type of event, e.g. |
n.cens |
variable containing counts of subjects censored during a
survival time interval; e.g. |
pyrs |
variable containing total subject-time accumulated within a
survival time interval; e.g. |
d.exp |
variable denoting total "expected numbers of events"
(typically computed |
n.pp |
variable containing total Pohar-Perme weighted counts of
subjects at risk in an interval,
supplied as argument |
d.pp |
variable(s) containing Pohar-Perme weighted counts of events,
supplied as argument |
d.pp.2 |
variable(s) containing total Pohar-Perme
"double-weighted" counts of events,
supplied as argument |
n.cens.pp |
variable containing total Pohar-Perme weighted counts of
censorings,
supplied as argument |
pyrs.pp |
variable containing total Pohar-Perme weighted subject-times,
supplied as argument |
d.exp.pp |
variable containing total Pohar-Perme weighted counts
of excess events,
supplied as argument |
surv.type |
one of |
surv.method |
either |
relsurv.method |
either |
subset |
a logical condition; e.g. |
conf.level |
confidence level used in confidence intervals;
e.g. |
conf.type |
character string; must be one of |
verbose |
logical; if |
Value
Returns a table of life time function values and other information with survival intervals as rows. Returns some of the following estimates of survival time functions:
- surv.obs - observed (raw, overall) survival
- surv.obs.K - observed cause-specific survival for cause K
- CIF_k - cumulative incidence function for cause k
- CIF.rel - cumulative incidence function using excess cases
- r.e2 - relative survival, Ederer II
- r.pp - relative survival, Pohar-Perme weighted
The suffix .as implies adjusted estimates, and .lo and .hi imply lower and upper confidence limits, respectively. The prefix SE. stands for standard error.
Basics
This function computes interval-based estimates of survival time functions, where the intervals are set by the user. For product-limit-based estimation see packages survival and relsurv.
If surv.type = 'surv.obs', only 'raw' observed survival
is estimated over the chosen time intervals. With
surv.type = 'surv.rel', relative survival estimates
are supplied in addition to observed survival figures.
surv.type = 'cif.obs' requests cumulative incidence functions (CIF)
to be estimated.
CIFs are estimated for each competing risk based
on a survival-interval-specific proportional hazards
assumption as described by Chiang (1968).
With surv.type = 'cif.rel', a CIF is estimated using
excess cases as the "cause-specific" cases. Finally, with
surv.type = 'surv.cause', cause-specific survivals are
estimated separately for each type of event.
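The interval-specific proportional hazards idea behind the CIF can be illustrated with a minimal base R sketch. It is not the code used inside survtab_ag; the counts are hypothetical, and the sketch assumes that within each interval the probability of any event is split between causes in proportion to the cause-specific event counts.

```r
# Hypothetical aggregated data: three one-year intervals, two competing events
delta <- c(1, 1, 1)      # interval lengths
pyrs  <- c(95, 80, 70)   # total subject-time per interval
d1    <- c(6, 4, 3)      # events of type 1
d2    <- c(4, 6, 2)      # events of type 2
d     <- d1 + d2         # events of any type

# Overall hazard and conditional probability of any event within each interval
h <- d / pyrs
q <- 1 - exp(-h * delta)

# Overall survival at the end and at the start of each interval
surv_end   <- cumprod(1 - q)
surv_start <- c(1, head(surv_end, -1))

# Assumed proportional split within an interval: cause k takes the share d_k / d
cif1 <- cumsum(surv_start * q * d1 / d)
cif2 <- cumsum(surv_start * q * d2 / d)

print(round(cbind(cif1, cif2, surv_end), 4))
```

Under this construction the two CIFs and the overall survival sum to one at the end of every interval, which is a convenient sanity check.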
In hazard-based estimation (surv.method = "hazard") survival
time functions are transformations of the estimated corresponding hazard
in the intervals. The hazard itself is estimated using counts of events
(or excess events) and total subject-time in the interval. Life table
(surv.method = "lifetable") estimates are constructed as transformations
of probabilities computed using counts of events and counts of subjects
at risk.
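As a rough illustration of the two estimation approaches, the base R sketch below computes observed survival both ways from hypothetical aggregated counts. The variable names are made up for the example, and the "effective" number at risk (subjects at risk minus half of the censorings) is the usual actuarial convention, assumed here rather than taken from the survtab_ag source.

```r
# Hypothetical aggregated counts for three one-year intervals
delta  <- c(1, 1, 1)            # interval lengths
n      <- c(100, 85, 72)        # subjects at risk at the start of each interval
d      <- c(10, 8, 5)           # events
n_cens <- c(5, 5, 4)            # censorings
pyrs   <- c(92.5, 78.5, 67.5)   # total subject-time

# Hazard-based: constant hazard within each interval,
# survival is exp(-cumulative hazard)
haz      <- d / pyrs
surv_haz <- exp(-cumsum(haz * delta))

# Life table: conditional event probability from counts,
# using the assumed effective number at risk
n_eff   <- n - n_cens / 2
surv_lt <- cumprod(1 - d / n_eff)

print(round(cbind(surv_haz, surv_lt), 4))
```

With data like these the two estimates are close; they tend to diverge when intervals are wide or hazards are high.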
The vignette survtab_examples has some practical examples.
Relative survival
When surv.type = 'surv.rel', the user can choose
relsurv.method = 'pp', whereupon Pohar-Perme weighting is used.
By default relsurv.method = 'e2', i.e. the Ederer II method
is used to estimate relative survival.
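A hazard-based Ederer II estimate can be sketched in base R as follows. This is an illustration with hypothetical numbers, assuming the excess hazard in an interval is (observed events - expected events) / subject-time; it is not a copy of the internal computation.

```r
# Hypothetical aggregated data for three one-year intervals
delta <- c(1, 1, 1)            # interval lengths
pyrs  <- c(92.5, 78.5, 67.5)   # total subject-time
d     <- c(10, 8, 5)           # observed events
d_exp <- c(2.1, 1.9, 1.8)      # expected events from population hazards

# Excess hazard per interval and the resulting relative survival
excess_haz <- (d - d_exp) / pyrs
r_e2       <- exp(-cumsum(excess_haz * delta))

# Equivalent view: observed survival divided by expected survival
surv_obs <- exp(-cumsum(d / pyrs * delta))
surv_exp <- exp(-cumsum(d_exp / pyrs * delta))

print(round(cbind(r_e2, ratio = surv_obs / surv_exp), 4))
```

The two columns printed at the end agree, showing that under this assumption the Ederer II estimate is the ratio of observed to expected survival.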
Adjusted estimates
Adjusted estimates in this context mean computing estimates separately by the levels of adjusting variables and returning weighted averages of the estimates. For example, computing estimates separately by age groups and returning a weighted average estimate (age-adjusted estimate).
Adjusting requires specification of both the adjusting variables and
the weights for all the levels of the adjusting variables. The former can be
accomplished by using adjust() with the argument formula,
or by supplying variables directly to argument adjust. E.g. the
following are all equivalent:
formula = fot ~ sex + adjust(agegr) + adjust(area)
formula = fot ~ sex + adjust(agegr, area)
formula = fot ~ sex, adjust = c("agegr", "area")
formula = fot ~ sex, adjust = list(agegr, area)
The adjusting variables must match with the variable names in the
argument weights; see the dedicated help page.
Typically weights are supplied as a list or a data.frame. The former
can be done by e.g. weights = list(agegr = VEC1, area = VEC2),
where VEC1 and VEC2 are vectors of weights (which do not
have to add up to one). See survtab_examples for an example of using
a data.frame to pass weights.
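The weighted-average idea can be shown with a few lines of base R. The stratum-specific estimates and weights below are hypothetical, and the sketch only covers a single adjusting variable.

```r
# Hypothetical stratum-specific survival estimates by age group
est <- c("0-44" = 0.90, "45-64" = 0.80, "65+" = 0.60)

# Weights for the same levels; they do not have to add up to one
w <- c("0-44" = 20, "45-64" = 50, "65+" = 30)

# Match weights to estimates by level name, normalise, and average
w_norm   <- w[names(est)] / sum(w)
adjusted <- sum(w_norm * est)

# Rescaling the weights by a constant leaves the result unchanged
w_big        <- 10 * w
adjusted_big <- sum(w_big[names(est)] / sum(w_big) * est)

print(c(adjusted = adjusted, adjusted_big = adjusted_big))
```

Matching by name rather than by position mirrors the requirement that the adjusting variables and the weights refer to the same levels.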
Period analysis and other data selection schemes
To calculate e.g. period analysis (delayed entry) estimates, limit the data when or before supplying it to this function. See survtab_examples.
Data requirements
survtab_ag computes estimates of survival time functions using
pre-aggregated data. For using subject-level data directly, use
survtab. For aggregating data, see lexpand and aggre.
By default, and if data is an aggre object (not mandatory),
survtab_ag makes use of the exact same breaks that were used in
splitting the original data (with e.g. lexpand), so it is not
necessary to specify any surv.breaks. If specified, the
surv.breaks must be a subset of the pertinent
pre-existing breaks. When data is not an aggre object, breaks
must always be specified. Interval lengths (delta in output) are
also calculated based on whichever breaks are used,
so the upper limit of the breaks should
therefore be meaningful and never e.g. Inf.
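The requirements on breaks can be checked by hand before calling the function. The base R sketch below uses hypothetical break vectors and a hypothetical tolerance; it shows a subset check that is robust to floating point steps such as 1/12, and why an infinite upper limit is a problem for interval lengths.

```r
# Hypothetical breaks used when the data were originally split (monthly, 5 years)
orig_breaks <- seq(0, 5, by = 1/12)

# Coarser breaks requested for estimation (annual)
surv_breaks <- 0:5

# Tolerance-based check that every requested break exists among the originals
is_subset <- all(vapply(
  surv_breaks,
  function(b) any(abs(orig_breaks - b) < 1e-8),
  logical(1)
))

# Interval lengths implied by the requested breaks
delta <- diff(surv_breaks)

# An infinite upper limit yields an infinite interval length
bad_breaks <- c(0, 1, 5, Inf)
bad_delta  <- diff(bad_breaks)

print(list(is_subset = is_subset, delta = delta, bad_delta = bad_delta))
```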
References
Perme, Maja Pohar, Janez Stare, and Jacques Esteve. "On estimation in relative survival." Biometrics 68.1 (2012): 113-120. doi:10.1111/j.1541-0420.2011.01640.x
Hakulinen, Timo, Karri Seppa, and Paul C. Lambert. "Choosing the relative survival method for cancer survival estimation." European Journal of Cancer 47.14 (2011): 2202-2210. doi:10.1016/j.ejca.2011.03.011
Seppa, Karri, Timo Hakulinen, and Arun Pokhrel. "Choosing the net survival method for cancer survival estimation." European Journal of Cancer (2013). doi:10.1016/j.ejca.2013.09.019
Chiang, Chin Long. Introduction to stochastic processes in biostatistics. 1968. ISBN-13: 978-0471155003
Seppa K., Dyba T. and Hakulinen T.: Cancer Survival, Reference Module in Biomedical Sciences. Elsevier. 08-Jan-2015. doi:10.1016/B978-0-12-801238-3.02745-8
See Also
splitMulti, lexpand, ICSS, sire
The survtab_examples vignette
Other main functions: Surv(), rate(), relpois(), relpois_ag(), sir(), sirspline(), survmean(), survtab()
Other survtab functions: Surv(), lines.survtab(), plot.survtab(), print.survtab(), summary.survtab(), survtab()
Examples
## see more examples with explanations in vignette("survtab_examples")
#### survtab_ag usage
data("sire", package = "popEpi")
## prepare data for e.g. 5-year "period analysis" for 2008-2012
## note: sire is a simulated cohort integrated into popEpi.
BL <- list(fot = seq(0, 5, by = 1/12),
           per = c("2008-01-01", "2013-01-01"))
x <- lexpand(sire, birth = bi_date, entry = dg_date, exit = ex_date,
             status = status %in% 1:2,
             breaks = BL,
             pophaz = popmort,
             aggre = list(fot))
## calculate relative EdererII period method
## NOTE: x is an aggre object here, so surv.breaks are deduced
## automatically
st <- survtab_ag(fot ~ 1, data = x)
summary(st, t = 1:5) ## annual estimates
summary(st, q = list(r.e2 = 0.75)) ## 1st interval where r.e2 < 0.75 at end
plot(st)
## non-aggre data: first call to survtab_ag would fail
df <- data.frame(x)
# st <- survtab_ag(fot ~ 1, data = df)
st <- survtab_ag(fot ~ 1, data = df, surv.breaks = BL$fot)
## calculate age-standardised 5-year relative survival ratio using
## Ederer II method and period approach
sire$agegr <- cut(sire$dg_age, c(0, 45, 55, 65, 75, Inf), right = FALSE)
BL <- list(fot = seq(0, 5, by = 1/12),
           per = c("2008-01-01", "2013-01-01"))
x <- lexpand(sire, birth = bi_date, entry = dg_date, exit = ex_date,
             status = status %in% 1:2,
             breaks = BL,
             pophaz = popmort,
             aggre = list(agegr, fot))
## age standardisation using internal weights (age distribution of
## patients diagnosed within the period window)
## (NOTE: what is done here is equivalent to using weights = "internal")
w <- aggregate(at.risk ~ agegr, data = x[x$fot == 0], FUN = sum)
names(w) <- c("agegr", "weights")
st <- survtab_ag(fot ~ adjust(agegr), data = x, weights = w)
plot(st, y = "r.e2.as", col = c("blue"))
## age standardisation using ICSS1 weights
data(ICSS)
## use a name that does not shadow the base function cut()
age_breaks <- c(0, 45, 55, 65, 75, Inf)
agegr <- cut(ICSS$age, age_breaks, right = FALSE)
w <- aggregate(ICSS1~agegr, data = ICSS, FUN = sum)
names(w) <- c("agegr", "weights")
st <- survtab_ag(fot ~ adjust(agegr), data = x, weights = w)
lines(st, y = "r.e2.as", col = c("red"))
## cause-specific survival
sire$stat <- factor(sire$status, 0:2, c("alive", "canD", "othD"))
x <- lexpand(sire, birth = bi_date, entry = dg_date, exit = ex_date,
             status = stat,
             breaks = BL,
             pophaz = popmort,
             aggre = list(agegr, fot))
st <- survtab_ag(fot ~ adjust(agegr), data = x, weights = w,
                 d = c("fromalivetocanD", "fromalivetoothD"),
                 surv.type = "surv.cause")
plot(st, y = "surv.obs.fromalivetocanD.as")
lines(st, y = "surv.obs.fromalivetoothD.as", col = "red")