samplesizelogisticcasecontrol-package {samplesizelogisticcasecontrol} | R Documentation |
Sample size and power calculations for case-control studies
Description
This package calculates the sample size required for a case-control study to
achieve a specified power, or the power of a case-control study for a given
sample size.
To calculate the sample size, one needs to specify the significance level \alpha,
power \gamma, and the hypothesized non-null \theta. Here \theta is a log odds ratio
for an exposure main effect or \theta is an interaction effect on the logistic scale.
Choosing \theta requires subject matter knowledge to understand how strong the
association needs to be to have practical importance.
Sample size varies inversely with \theta^{2} and is thus highly dependent on \theta.
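The inverse-square dependence on \theta can be illustrated with a small sketch. It is not part of the package: the helper name `relative_sample_size` is hypothetical, and the calculation assumes the variance terms stay fixed as \theta changes, which is only an approximation because the variance under the alternative can itself depend on \theta.

```python
import math


def relative_sample_size(theta_ref, theta_new):
    """Approximate factor by which the sample size changes when the
    hypothesized effect moves from theta_ref to theta_new.

    Assumption: the variance terms are held fixed, so n is taken to be
    proportional to 1 / theta**2.
    """
    return (theta_ref / theta_new) ** 2


# Compare several hypothesized odds ratios against a reference odds ratio of 2.
theta_ref = math.log(2.0)
for odds_ratio in (2.0, 1.5, 1.25):
    theta = math.log(odds_ratio)  # theta is on the log odds ratio scale
    factor = relative_sample_size(theta_ref, theta)
    print(f"OR = {odds_ratio}: about {factor:.2f} times the reference sample size")
```

Under this approximation, halving \theta multiplies the required sample size by four, which is why a modest change in the hypothesized odds ratio can change the study size substantially.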
Details
The main functions in the package are for different types of exposure variables, where the
exposure variable is the variable of interest in a hypothesis test.
The functions sampleSize_binary and power_binary can be used for a binary exposure
variable (X = 0 or 1), while the functions sampleSize_ordinal and power_ordinal are
more general functions that can be used for an ordinal exposure variable
(X takes the values 0, 1, ..., k).
sampleSize_continuous and power_continuous are useful for a continuous exposure
variable, and sampleSize_data and power_data can be used when pilot data are
available that define the distribution of the exposure and other confounding
variables. Each function returns the sample sizes or power for a Wald-type test and
a score test. When there are no adjustments for confounders, the user can specify a
general distribution for the exposure variable. With confounders, the user must
supply either pilot data or a function that generates random samples from the
multivariate distribution of the confounders and exposure variable.
If the parameter of interest, \theta, is one dimensional, then the test statistic is
often asymptotically equivalent to a test of the form
T > Z_{1-\alpha}\sigma_{0}n^{-\frac{1}{2}}
or
T > Z_{1-\alpha}\sigma_{\theta}n^{-\frac{1}{2}},
where Z_{1-\alpha} is the 1-\alpha quantile of a standard normal distribution,
n is the total sample size (cases plus controls), and n^{\frac{1}{2}}T is
normally distributed with mean 0 and null variance \sigma_{0}^{2}.
Depending on which critical value,
Z_{1-\alpha}\sigma_{0}n^{-\frac{1}{2}} or Z_{1-\alpha}\sigma_{\theta}n^{-\frac{1}{2}},
of the test was used, the formulas for sample size are obtained by inverting the
equations for power:
n_{1} = (Z_{\gamma}\sigma_{\theta} + Z_{1-\alpha}\sigma_{0})^{2}/\theta^{2}
or
n_{2} = (Z_{\gamma} + Z_{1-\alpha})^{2}\sigma_{\theta}^{2}/\theta^{2}.
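The two sample size formulas and the power equations they invert can be sketched numerically. This is a minimal illustration rather than the package's implementation: all function names are hypothetical, \sigma_{0} and \sigma_{\theta} are supplied directly by the caller instead of being derived from an exposure distribution, \sigma_{\theta} is assumed to be the standard deviation of n^{\frac{1}{2}}T under the alternative, and the test is assumed to be one-sided with \theta > 0.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def sample_size_n1(theta, sigma0, sigma_theta, alpha=0.05, power=0.9):
    """n_1: critical value based on the null standard deviation sigma0."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    z_gamma = _STD_NORMAL.inv_cdf(power)
    return (z_gamma * sigma_theta + z_alpha * sigma0) ** 2 / theta ** 2


def sample_size_n2(theta, sigma_theta, alpha=0.05, power=0.9):
    """n_2: critical value based on sigma_theta."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    z_gamma = _STD_NORMAL.inv_cdf(power)
    return (z_gamma + z_alpha) ** 2 * sigma_theta ** 2 / theta ** 2


def power_n1(n, theta, sigma0, sigma_theta, alpha=0.05):
    """Power of the test T > Z_{1-alpha} * sigma0 / sqrt(n)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf((theta * math.sqrt(n) - z_alpha * sigma0) / sigma_theta)


def power_n2(n, theta, sigma_theta, alpha=0.05):
    """Power of the test T > Z_{1-alpha} * sigma_theta / sqrt(n)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf(theta * math.sqrt(n) / sigma_theta - z_alpha)


# Illustrative inputs only; these values do not come from any real study.
theta = math.log(1.5)
sigma0, sigma_theta = 4.0, 4.2

n1 = sample_size_n1(theta, sigma0, sigma_theta)
n2 = sample_size_n2(theta, sigma_theta)
print("n1 (rounded up):", math.ceil(n1))
print("n2 (rounded up):", math.ceil(n2))
print("power at n1:", power_n1(n1, theta, sigma0, sigma_theta))
print("power at n2:", power_n2(n2, theta, sigma_theta))
```

Evaluating each power function at its own sample size returns the target power, which confirms that each formula is the inverse of the corresponding power equation. When \sigma_{0} = \sigma_{\theta}, the two formulas give the same sample size.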
Author(s)
Mitchell H. Gail <gailm@mail.nih.gov>
References
Gail, M.H. and Haneuse, S. Power and sample size for case-control studies. In Handbook of Statistical Methods for Case-Control Studies. Editors: Ornulf Borgan, Norman Breslow, Nilanjan Chatterjee, Mitchell Gail, Alastair Scott, Christopher Wild. Chapman and Hall/CRC, Taylor and Francis Group, New York, 2018, pages 163-187.
Gail, M.H. and Haneuse, S. Power and sample size for multivariate logistic modeling of unmatched case-control studies. Statistical Methods in Medical Research. 2019;28(3):822-834, https://doi.org/10.1177/0962280217737157