cchs {cchs}  R Documentation 
cchs fits a Cox proportional-hazards regression model to case–cohort data where the subcohort was selected by stratified simple random sampling. It uses Estimator III of Borgan et al (2000).
cchs(formula, data=parent.frame(), inSubcohort, stratum, samplingFractions, cohortStratumSizes, precision=NULL, returnAdjustedTimes=FALSE, swap=TRUE, dropNeverAtRiskRows=TRUE, dropSubcohEventsDfbeta=FALSE, adjustSampFracIfAnyNAs=FALSE, keepAllCoxphElements=FALSE, confidenceLevel=0.95, verbose=FALSE, annotateErrors=TRUE, coxphControl, ...)
formula 
An object of class 
data 
A dataframe or environment that contains the variables used in the formula. The variables named in 
inSubcohort 
A vector of logical variables that shows whether each observation/row is in the subcohort ( 
stratum 
A vector that defines the strata within which the subcohort was selected. Each element of 
samplingFractions, cohortStratumSizes 

precision 
For example, if the times were recorded to the nearest day but are stored as numbers of years, then 
returnAdjustedTimes 
If this is 
swap 
If this is 
dropNeverAtRiskRows 
If this is 
dropSubcohEventsDfbeta 
If this is 
adjustSampFracIfAnyNAs 
If this is 
keepAllCoxphElements 
If this is 
confidenceLevel 
The level for the hazard-ratio confidence intervals (a number in the interval [0,1]). 
verbose 
If this is 
annotateErrors 
If this is 
coxphControl, ... 
These are optional arguments to control the working of 
In a case–cohort study, the dataset consists only of the cases (the participants who have an event) and the participants who are in the subcohort, which is a randomly selected subset of the cohort. In a stratified case–cohort study, the subcohort is selected by stratified simple random sampling. This means that the cohort is divided into strata, and from each stratum a proportion of the participants equal to that stratum's sampling fraction is selected to be in the subcohort (and within each stratum, each participant is selected with equal probability). For more on stratified case–cohort studies see any of the references listed below.
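To make the sampling scheme concrete, here is a minimal base-R sketch that draws a subcohort from a made-up cohort by stratified simple random sampling. The cohort, the stratum labels "a" and "b", and the sampling fractions are all hypothetical and are not taken from cchsData or from the cchs package.

```r
# Hypothetical cohort: 1000 participants in two strata (illustrative only).
set.seed(1)
cohort <- data.frame(
  id = 1:1000,
  stratum = rep(c("a", "b"), times = c(800, 200))
)

# Hypothetical sampling fractions, one per stratum.
samplingFractions <- c(a = 0.1, b = 0.5)

# Within each stratum, select a simple random sample whose size is the
# stratum size multiplied by that stratum's sampling fraction.
cohort$inSubcohort <- FALSE
for (s in names(samplingFractions)) {
  rowsInStratum <- which(cohort$stratum == s)
  nToSelect <- round(length(rowsInStratum) * samplingFractions[[s]])
  # Index into rowsInStratum so that a stratum of size 1 is handled safely.
  selected <- rowsInStratum[sample.int(length(rowsInStratum), nToSelect)]
  cohort$inSubcohort[selected] <- TRUE
}

# Number of subcohort members and non-members in each stratum.
table(cohort$stratum, cohort$inSubcohort)
```

In a real case–cohort dataset, the rows that are neither cases nor subcohort members would then be discarded, which is why the sampling fractions or the cohort stratum sizes have to be supplied separately.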
cchs fits a Cox proportional-hazards regression model to data from a stratified case–cohort study, using the time-fixed version of Estimator III from Borgan et al (2000). Estimators I and II from Borgan et al (2000) are available by using cch with the options method="I.Borgan" and method="II.Borgan", but only Estimator III is score-unbiased, which is the main desirable criterion. The data must be in the usual form where each row corresponds to one observation (that is, one participant). cchs works by manipulating the data in various ways, then passing it to coxph (which is suitable for fitting a Cox model to data from a cohort study), and finally making corrections to the variance estimates. A vignette containing more detail is planned.
For normal use, the logical (boolean) arguments should have their default values. cchs performs a complete-case analysis, meaning that rows will be dropped if they contain NAs in any of the variables that appear in the model, including inside the Surv(), or in inSubcohort or stratum. NAs are not allowed in samplingFractions or cohortStratumSizes, unless that vector has names and any of those names are not equal to values of stratum, in which case the corresponding elements can be NA.
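As an illustration of what a complete-case analysis means, the following base-R sketch drops every row that has an NA in any of the relevant columns. The data frame is made up, and this is only a sketch of the idea, not the code that cchs runs internally.

```r
# Hypothetical data with some missing values (illustrative only).
d <- data.frame(
  time = c(5, 8, NA, 12),
  isCase = c(TRUE, FALSE, TRUE, TRUE),
  stage = c("I", NA, "II", "I"),
  inSubcohort = c(TRUE, TRUE, FALSE, TRUE),
  stratum = c("a", "a", "b", "b")
)

# Columns that appear in the model, plus inSubcohort and stratum.
neededColumns <- c("time", "isCase", "stage", "inSubcohort", "stratum")

# Keep only the rows that have no NAs in any of those columns.
completeRows <- complete.cases(d[, neededColumns])
dComplete <- d[completeRows, ]

# Rows 2 and 3 contain NAs, so only rows 1 and 4 remain.
nrow(dComplete)
```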
cchs does not normally give replicable results, because the swapping and the small changes to tied event-times are random (see swap and precision in the Arguments section). To get exactly the same results every time, use set.seed with a fixed seed just before calling cchs.
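A minimal sketch of fixing the seed is shown here. It assumes that the cchs and survival packages are installed, and it reuses the call from the Examples section. The wrapper function fitOnce and the seed value 42 are hypothetical names and values chosen for illustration.

```r
library(survival)
library(cchs)

# Hypothetical helper: set the seed, then fit the model from the Examples.
fitOnce <- function(seed) {
  set.seed(seed)
  cchs(Surv(time, isCase) ~ stage + centralLabHistol + ageAtDiagnosis,
       data = cchsData, inSubcohort = inSubcohort, stratum = localHistol,
       samplingFractions = sampFrac, precision = 1)
}

fit1 <- fitOnce(42)
fit2 <- fitOnce(42)

# With the same seed, the two fits should have identical coefficients.
all.equal(fit1$coefficients, fit2$coefficients)
```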
For more information about cchs see the article in R Journal, Jones (2018).
An S3 object of class cchs. This is a list that contains the following elements:
coefficients 
The vector of coefficients. 
var 
The variance matrix of the coefficients. 
loglik 
A vector of two elements: the first is the log-likelihood with the initial values of the coefficients that were used in the iteration to find the maximum likelihood, and the second is the maximized log-likelihood, that is, the log-likelihood with the final values of the coefficients. (Strictly speaking these should all say “pseudo-likelihood” instead of “likelihood”.) 
iter 
The number of iterations used by 
n 
The number of observations (that is, rows), that were used in the call to 
nevent 
The number of events (also called failures). 
call 
The call that was used to create the 
coeffsTable 
A summary of the main output. This is a matrix that contains the hazard ratios, confidence intervals for them, p-values for the Wald tests, log hazard ratios (which are the coefficients in the Cox model), and standard errors of the log hazard ratios. 
confidenceLevel 
The level for the confidence intervals in 
nEachStatus 
A vector with three elements: the numbers of subcohort noncases, subcohort cases, and nonsubcohort cases. The sum of these is 
nStrata 
The number of strata that appear in the data. 
message 
A message about observations that have been dropped because of NAs and event-times that have been changed to deal with ties, if either of these happened. 
If keepAllCoxphElements is TRUE, then the cchs object will also contain the other elements listed under coxph.object. If returnAdjustedTimes is TRUE, then it will contain an adjustedTimes element, which is a vector of the adjusted exit-times (with elements in the same order as the observations/rows in the data).
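As a rough sketch of how these elements might be inspected, the code below fits the model from the Examples section with returnAdjustedTimes=TRUE and then extracts several of the elements listed above. It assumes that the cchs and survival packages are installed. The object name fit and the seed value are arbitrary choices.

```r
library(survival)
library(cchs)

# Fix the seed so that the random tie-breaking and swapping are repeatable.
set.seed(1)
fit <- cchs(Surv(time, isCase) ~ stage + centralLabHistol + ageAtDiagnosis,
            data = cchsData, inSubcohort = inSubcohort, stratum = localHistol,
            samplingFractions = sampFrac, precision = 1,
            returnAdjustedTimes = TRUE)

# Hazard ratios, confidence intervals, p-values, and standard errors.
fit$coeffsTable

# Numbers of subcohort non-cases, subcohort cases, and non-subcohort cases.
fit$nEachStatus

# Present only because returnAdjustedTimes = TRUE was specified.
head(fit$adjustedTimes)
```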
Borgan, Ø., Langholz, B., Samuelsen, S.O., Goldstein, L., Pogoda, J. (2000). Exposure stratified case–cohort designs. Lifetime Data Analysis 6 (1), 39–58.
Cologne, J., Preston, D.L., Imai, K., Misumi, M., Yoshida, K., Hayashi, T., Nakachi, K. (2012). Conventional case–cohort design and analysis for studies of interaction. International Journal of Epidemiology 41 (4), 1174–1186.
Jones, E. (2018). cchs: An R package for stratified case–cohort studies. R Journal 10 (1), 484–494.
Langholz, B., Jiao, J. (2007). Computational methods for case–cohort studies. Computational Statistics and Data Analysis 51 (8), 3737–3748.
cch, which can calculate Estimators I and II from Borgan et al (2000), coxph, which cchs uses internally, and coxph.control, a container for certain parameters that are passed to coxph. These are all in the survival package.
cchsData, an example dataset that cchs can be used on.
# Analyze the relation between survival and three covariates in cchsData.
# The times are stored as numbers of days, so precision has to be 1. The
# selection of the subcohort was stratified according to two strata, defined
# by cchsData$localHistol, and the sampling fractions are stored in
# cchsData$sampFrac.
cchs(Surv(time, isCase) ~ stage + centralLabHistol + ageAtDiagnosis,
   data=cchsData, inSubcohort=inSubcohort, stratum=localHistol,
   samplingFractions=sampFrac, precision=1)

# Do the same analysis using cohortStratumSizes instead of samplingFractions.
# For the value of cohortStratumSizes see the Details section of ?cchsData.
# These two calls to cchs will give slightly different results unless set.seed
# is used with the same seed just before both of them.
cchs(Surv(time, isCase) ~ stage + centralLabHistol + ageAtDiagnosis,
   data=cchsData, inSubcohort=inSubcohort, stratum=localHistol,
   cohortStratumSizes=c(favorable=3622, unfavorable=406), precision=1)