ssizeCT {powerSurvEpi} | R Documentation |
Sample Size Calculation in the Analysis of Survival Data for Clinical Trials
Description
Sample size calculation for the Comparison of Survival Curves Between Two Groups under the Cox Proportional-Hazards Model for clinical trials. Some parameters will be estimated based on a pilot data set.
Usage
ssizeCT(formula, dat, power, k, RR, alpha = 0.05)
Arguments
formula |
A formula object, e.g. |
dat |
a data frame representing the pilot data set and containing at least 3 columns: (1) survival/censoring time; (2) censoring indicator;
(3) group indicator which is a factor object in R and takes only two possible values ( |
power |
numeric. power to detect the magnitude of the hazard ratio as small as that specified by |
k |
numeric. ratio of participants in group E (experimental group) compared to group C (control group). |
RR |
numeric. postulated hazard ratio. |
alpha |
numeric. type I error rate. |
Details
This is an implementation of the sample size calculation method described in Section 14.12 (page 807) of Rosner (2006). The method was proposed by Freedman (1982).
The motivation for this function is that sometimes we do not have information about m, or about p_E and p_C, but we do have a pilot data set that can be used to estimate p_E and p_C, and hence m. Here m = n_E p_E + n_C p_C is the expected total number of events over both groups, and n_E and n_C are the numbers of participants in group E (experimental group) and group C (control group), respectively. p_E is the probability of failure in group E (experimental group) over the maximum time period of the study (t years). p_C is the probability of failure in group C (control group) over the maximum time period of the study (t years).
Suppose we want to compare the survival curves between an experimental group (E) and a control group (C) in a clinical trial with a maximum follow-up of t years.
The Cox proportional hazards regression model is assumed to have the form:
h(t|X_1)=h_0(t)\exp(\beta_1 X_1).
Let n_E be the number of participants in the E group and n_C be the number of participants in the C group. We wish to test the hypothesis H0: RR = 1 versus H1: RR not equal to 1, where RR = \exp(\beta_1) is the underlying hazard ratio for the E group versus the C group. Let RR be the postulated hazard ratio and \alpha be the significance level. Assume that the test is two-sided.
If the ratio of participants in group E compared to group C is n_E/n_C = k, then the number of participants needed in each group to achieve a power of 1-\beta is
n_E=\frac{m k}{k p_E + p_C}, n_C=\frac{m}{k p_E + p_C}
where
m=\frac{1}{k}\left(\frac{k RR + 1}{RR - 1}\right)^2\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2,
and z_{1-\alpha/2} is the 100(1-\alpha/2)-th percentile of the standard normal distribution N(0, 1).
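The formulas for m, n_E and n_C can be sketched directly in base R. This is a minimal illustration, not the package's implementation: the helper name ssize_sketch is hypothetical, and the values of pE and pC passed to it are made-up inputs rather than estimates from any data set.

```r
# Hypothetical helper (not part of powerSurvEpi): sample sizes from the
# formulas for m, n_E and n_C, given assumed values of p_E and p_C.
ssize_sketch <- function(power, k, pE, pC, RR, alpha = 0.05) {
  z_alpha <- qnorm(1 - alpha / 2)  # z_{1-alpha/2}
  z_beta  <- qnorm(power)          # z_{1-beta}, since power = 1 - beta
  # expected total number of events over both groups
  m <- (1 / k) * ((k * RR + 1) / (RR - 1))^2 * (z_alpha + z_beta)^2
  nE <- m * k / (k * pE + pC)
  nC <- m / (k * pE + pC)
  # round the group sizes up to integers
  list(m = m, ssize = c(nE = ceiling(nE), nC = ceiling(nC)))
}

# Illustrative inputs only: pE = 0.2 and pC = 0.3 are made-up values
out <- ssize_sketch(power = 0.8, k = 1, pE = 0.2, pC = 0.3, RR = 0.7)
print(out$m)
print(out$ssize)
```

Because m depends only on k, RR, alpha and the power, the pilot data affect the result only through p_E and p_C in the denominators.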
p_C and p_E can be calculated from the following formulas:
p_C=\sum_{i=1}^{t}D_i, p_E=\sum_{i=1}^{t}E_i,
where D_i=\lambda_i A_i C_i, E_i=RR\lambda_i B_i C_i, A_i=\prod_{j=0}^{i-1}(1-\lambda_j), B_i=\prod_{j=0}^{i-1}(1-RR\lambda_j), and C_i=\prod_{j=0}^{i-1}(1-\delta_j).
Here \lambda_i is the probability of failure at time i among participants in the control group, given that a participant has survived to time i-1 and is not censored at time i-1, i.e., the approximate hazard at time i in the control group, i=1,...,t;
RR\lambda_i is the probability of failure at time i among participants in the experimental group, given that a participant has survived to time i-1 and is not censored at time i-1, i.e., the approximate hazard at time i in the experimental group, i=1,...,t;
\delta_i is the probability that a participant is censored at time i given that the participant was followed up to time i and has not failed, i=0, 1, ..., t, which is assumed to be the same in each group.
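The sums for p_C and p_E can be sketched in base R with cumulative products. This is only an illustration under stated assumptions: the helper name failure_probs is hypothetical, \lambda_0 is assumed to be 0 (the text defines \lambda_i only for i=1,...,t), and the hazards and censoring probabilities in the example call are made-up numbers.

```r
# Hypothetical helper (not part of powerSurvEpi).
# lambda: control-group hazards lambda_1, ..., lambda_t
# delta:  censoring probabilities delta_0, ..., delta_t (length t + 1)
# RR:     postulated hazard ratio
failure_probs <- function(lambda, delta, RR) {
  t <- length(lambda)
  stopifnot(length(delta) == t + 1)
  idx <- seq_len(t)
  lambda0 <- c(0, lambda)  # assumption: lambda_0 = 0
  # element i of each cumulative product runs over j = 0, ..., i - 1
  A <- cumprod(1 - lambda0)[idx]       # A_i
  B <- cumprod(1 - RR * lambda0)[idx]  # B_i
  C <- cumprod(1 - delta)[idx]         # C_i
  D <- lambda * A * C                  # D_i
  E <- RR * lambda * B * C             # E_i
  list(pC = sum(D), pE = sum(E))
}

# Illustrative inputs only: two time points with made-up values
probs <- failure_probs(lambda = c(0.1, 0.1),
                       delta = c(0, 0.05, 0.05),
                       RR = 0.5)
print(probs)
```

Note that \delta_t does not enter either sum, since C_i only uses \delta_0 through \delta_{i-1}.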
Value
mat.lambda |
a matrix with 9 columns and |
mat.event |
a matrix with 5 columns and |
pC |
estimated probability of failure in group C (control group) over the maximum time period of the study (t years). |
pE |
estimated probability of failure in group E (experimental group) over the maximum time period of the study (t years). |
ssize |
a two-element vector. The first element is |
Note
(1) The estimate of RR\lambda_i is computed as RR*\lambda_i. That is, RR\lambda_i is not directly estimated based on data from the experimental group;
(2) The sample size formula assumes that the central-limit theorem is valid and hence is appropriate for large samples.
(3) n_E and n_C will be rounded up to integers.
References
Freedman, L.S. (1982). Tables of the number of patients required in clinical trials using the log-rank test. Statistics in Medicine. 1: 121-129.
Rosner, B. (2006). Fundamentals of Biostatistics. (6th edition). Thomson Brooks/Cole.
See Also
Examples
# Example 14.42 in Rosner B. Fundamentals of Biostatistics.
# (6-th edition). (2006) page 809
library(survival)
library(powerSurvEpi)  # provides ssizeCT() and the Oph data set
data(Oph)
res <- ssizeCT(formula = Surv(times, status) ~ group,
               dat = Oph,
               power = 0.8,
               k = 1,
               RR = 0.7,
               alpha = 0.05)
# Table 14.24 on page 809 of Rosner (2006)
print(round(res$mat.lambda, 4))
# Table 14.12 on page 787 of Rosner (2006)
print(round(res$mat.event, 4))
# the sample size
print(res$ssize)