variance.missingdata {CaseCohortCoxSurvival}

R Documentation

variance.missingdata

Description

Computes the variance estimate that follows the complete variance decomposition, for a parameter such as the log-relative hazard, the cumulative baseline hazard or a covariate-specific pure risk, when covariate information is missing for individuals in the phase-two sample.

Usage

variance.missingdata(n, casecohort, casecohort.phase2, weights, 
weights.phase2, weights.p2.phase2, infl2, infl3, stratified.p2 = NULL, 
estimated.weights = NULL)

Arguments

`n`	number of individuals in the whole cohort.
`casecohort`	If `stratified.p2 = TRUE`, data frame with `W` (the `J` phase-two strata), `strata.m` (vector of length `J` with the numbers of sampled individuals in the strata in the second phase of sampling) and `strata.n` (vector of length `J` with the strata sizes in the cohort), for each individual in the stratified case cohort data. If `stratified.p2 = FALSE`, data frame with `m` (number of sampled individuals in the second phase of sampling) and `n` (cohort size), for each individual in the unstratified case cohort data.
`casecohort.phase2`	If `stratified.p2 = TRUE`, data frame with `W` (the `J` phase-two strata), `strata.m` (vector of length `J` with the numbers of sampled individuals in the strata in the second phase of sampling), `strata.n` (vector of length `J` with the strata sizes in the cohort) and `phase3` (phase-three sampling indicator), for each individual in the phase-two sample. If `stratified.p2 = FALSE`, data frame with `m` (number of sampled individuals in the second phase of sampling), `n` (cohort size) and `unstrat.phase3` (phase-three sampling indicator), for each individual in the phase-two sample.
`weights`	vector with design weights for the individuals in the case cohort data.
`weights.phase2`	vector with design weights for the individuals in the phase-two sample.
`weights.p2.phase2`	vector with phase-two design weights for the individuals in the phase-two sample.
`infl2`	matrix with the phase-two influences on the parameter.
`infl3`	matrix with the phase-three influences on the parameter.
`stratified.p2`	was the second phase of sampling stratified on `W`? Default is `FALSE`.
`estimated.weights`	were the phase-three weights estimated? Default is `FALSE`.
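
As a rough illustration of the stratified layout described above, the following toy data frame has the columns expected in `casecohort.phase2`. All values are hypothetical, and two points are assumptions rather than something stated here: that each row repeats the `strata.m` and `strata.n` values of its own stratum, and that `phase3` is coded 1/0.

```r
# Toy phase-two sample with J = 2 strata (all values hypothetical)
strata.n <- c(`0` = 600, `1` = 400)  # stratum sizes in the cohort
strata.m <- c(`0` = 3, `1` = 2)      # numbers sampled in phase two

W <- c(0, 0, 0, 1, 1)                # stratum of each phase-two individual

casecohort.phase2.toy <- data.frame(
  W        = W,
  strata.m = unname(strata.m[as.character(W)]),  # stratum value repeated per row
  strata.n = unname(strata.n[as.character(W)]),
  phase3   = c(1, 0, 1, 1, 1)        # assumed coding: 0 = covariates missing
)
casecohort.phase2.toy
```

Under this layout, the case cohort data passed as `casecohort` would correspond to the rows with `phase3 == 1`.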

Details

variance.missingdata works for estimation from a case cohort with design weights when covariate information is missing for certain individuals in the phase-two data (i.e., the case cohort is obtained from three phases of sampling and consists of the individuals in the phase-two data without missing covariate information).

If there are no missing covariates in the phase-two sample, use variance with either design weights or calibrated weights.

variance.missingdata uses the variance formulas provided in Etievant and Gail (2023): those of Section 5.4 if estimated.weights = TRUE, and those of Web Appendix H.2 if estimated.weights = FALSE.

Value

variance: variance estimate.
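
A minimal sketch of one way the returned value might be used, assuming the variance estimates are on the same scale as the parameter estimates: standard errors and Wald-type 95% confidence intervals. The numbers and the names `beta.hat` and `var.beta` are hypothetical and are not package output.

```r
# Hypothetical estimates and variance estimates, for illustration only
beta.hat <- c(X1 = -0.20, X2 = 0.25, X3 = -0.30)   # log-relative hazards
var.beta <- c(X1 = 0.010, X2 = 0.004, X3 = 0.020)  # variance estimates

se.beta <- sqrt(var.beta)   # standard errors
z       <- qnorm(0.975)     # normal quantile for a 95% interval

ci <- cbind(lower = beta.hat - z * se.beta,
            upper = beta.hat + z * se.beta)
ci
```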

References

Etievant, L., Gail, M.H. (2023). Cox model inference for relative hazard and pure risk from stratified weight-calibrated case-cohort data. Submitted.

Examples


library(survival) # provides coxph() and Surv()
library(CaseCohortCoxSurvival)

data(dataexample.missingdata, package="CaseCohortCoxSurvival")

cohort            <- dataexample.missingdata$cohort # a simulated cohort
n                 <- nrow(cohort)
casecohort        <- dataexample.missingdata$casecohort # a simulated stratified case cohort
casecohort.phase2 <- dataexample.missingdata$casecohort.phase2 
riskmat.phase2    <- dataexample.missingdata$riskmat.phase2
dNt.phase2        <- dataexample.missingdata$dNt.phase2
B.phase2          <- dataexample.missingdata$B.phase2

Tau1    <- 0 # given time interval for the pure risk
Tau2    <- 8
x       <- c(-1, 1, -0.6) # given covariate profile for the pure risk

# Estimation using the stratified case cohort with true known design weights

mod <- coxph(Surv(times, status) ~ X1 + X2 + X3, data = casecohort, 
             weights = weights.true, id = id, robust = TRUE)

estimation <- influences.missingdata(mod = mod, riskmat.phase2 = riskmat.phase2, 
                                     dNt.phase2 = dNt.phase2, Tau1 = Tau1, 
                                     Tau2 = Tau2, x = x)
infl.beta     <- estimation$infl.beta
infl.Lambda0  <- estimation$infl.Lambda0.Tau1Tau2
infl.Pi.x     <- estimation$infl.Pi.x.Tau1Tau2
infl2.beta    <- estimation$infl2.beta
infl2.Lambda0 <- estimation$infl2.Lambda0.Tau1Tau2
infl2.Pi.x    <- estimation$infl2.Pi.x.Tau1Tau2
infl3.beta    <- estimation$infl3.beta
infl3.Lambda0 <- estimation$infl3.Lambda0.Tau1Tau2
infl3.Pi.x    <- estimation$infl3.Pi.x.Tau1Tau2

# variance estimate for the log-relative hazard
variance.missingdata(n = n, casecohort = casecohort, 
                     casecohort.phase2 = casecohort.phase2, 
                     weights = casecohort$weights.true, 
                     weights.phase2 = casecohort.phase2$weights.true, 
                     weights.p2.phase2 = casecohort.phase2$weights.p2.true,
                     infl2 = infl2.beta, infl3 = infl3.beta, 
                     stratified.p2 = TRUE)

# variance estimate for the cumulative baseline hazard estimate
variance.missingdata(n = n, casecohort = casecohort, 
                     casecohort.phase2 = casecohort.phase2, 
                     weights = casecohort$weights.true, 
                     weights.phase2 = casecohort.phase2$weights.true,
                     weights.p2.phase2 = casecohort.phase2$weights.p2.true,
                     infl2 = infl2.Lambda0, infl3 = infl3.Lambda0, 
                     stratified.p2 = TRUE)

# variance estimate for the pure risk estimate
variance.missingdata(n = n, casecohort = casecohort, 
                     casecohort.phase2 = casecohort.phase2, 
                     weights = casecohort$weights.true, 
                     weights.phase2 = casecohort.phase2$weights.true, 
                     weights.p2.phase2 = casecohort.phase2$weights.p2.true,
                     infl2 = infl2.Pi.x, infl3 = infl3.Pi.x, 
                     stratified.p2 = TRUE)


# Estimation using the stratified case cohort with estimated weights, and
# accounting for the estimation through the influences 

mod.est <- coxph(Surv(times, status) ~ X1 + X2 + X3, data = casecohort, 
                 weights = weights.est, id = id, robust = TRUE)

estimation.est  <- influences.missingdata(mod.est, 
                                          riskmat.phase2 = riskmat.phase2, 
                                          dNt.phase2 = dNt.phase2, 
                                          estimated.weights = TRUE,
                                          B.phase2 = B.phase2, Tau1 = Tau1, 
                                          Tau2 = Tau2, x = x)
infl.beta.est     <- estimation.est$infl.beta
infl.Lambda0.est  <- estimation.est$infl.Lambda0.Tau1Tau2
infl.Pi.x.est     <- estimation.est$infl.Pi.x.Tau1Tau2
infl2.beta.est    <- estimation.est$infl2.beta
infl2.Lambda0.est <- estimation.est$infl2.Lambda0.Tau1Tau2
infl2.Pi.x.est    <- estimation.est$infl2.Pi.x.Tau1Tau2
infl3.beta.est    <- estimation.est$infl3.beta
infl3.Lambda0.est <- estimation.est$infl3.Lambda0.Tau1Tau2
infl3.Pi.x.est    <- estimation.est$infl3.Pi.x.Tau1Tau2

# variance estimate for the log-relative hazard
variance.missingdata(n = n, casecohort = casecohort, 
                     casecohort.phase2 = casecohort.phase2, 
                     weights = casecohort$weights.est, 
                     weights.phase2 = casecohort.phase2$weights.est, 
                     weights.p2.phase2 = casecohort.phase2$weights.p2.true,
                     infl2 = infl2.beta.est, infl3 = infl3.beta.est, 
                     stratified.p2 = TRUE, estimated.weights = TRUE)

# variance estimate for the cumulative baseline hazard estimate
variance.missingdata(n = n, casecohort = casecohort,
                     casecohort.phase2 = casecohort.phase2, 
                     weights = casecohort$weights.est, 
                     weights.phase2 = casecohort.phase2$weights.est, 
                     weights.p2.phase2 = casecohort.phase2$weights.p2.true,
                     infl2 = infl2.Lambda0.est, infl3 = infl3.Lambda0.est, 
                     stratified.p2 = TRUE, estimated.weights = TRUE)

# variance estimate for the pure risk estimate
variance.missingdata(n = n, casecohort = casecohort, 
                     casecohort.phase2 = casecohort.phase2, 
                     weights = casecohort$weights.est, 
                     weights.phase2 = casecohort.phase2$weights.est, 
                     weights.p2.phase2 = casecohort.phase2$weights.p2.true,
                     infl2 = infl2.Pi.x.est, infl3 = infl3.Pi.x.est, 
                     stratified.p2 = TRUE, estimated.weights = TRUE)

[Package CaseCohortCoxSurvival version 0.0.34 Index]