gestMultiple {gesttools} | R Documentation |
G-Estimation for a Time-Varying Outcome
Description
Performs g-estimation of a structural nested mean model (SNMM), based on the outcome regression methods described in Vansteelandt and Sjolander (2016) and Dukes and Vansteelandt (2018). We assume a dataset with a time-varying outcome that is either binary or continuous, time-varying and/or baseline confounders, and a time-varying exposure that is either binary, continuous or categorical.
Usage
gestMultiple(
data,
idvar,
timevar,
Yn,
An,
Cn = NA,
outcomemodels,
propensitymodel,
censoringmodel = NULL,
type,
EfmVar = NA,
cutoff = NA,
...
)
Arguments
data |
A data frame in long format containing the data to be analysed. See Details. |
idvar |
Character string specifying the name of the ID variable in data. |
timevar |
Character string specifying the name of the time variable in the data. Note that timevar must specify time periods as integer values starting from 1 (must not begin at 0). |
Yn |
Character string specifying the name of the time-varying outcome variable. |
An |
Character string specifying the name of the time-varying exposure variable. |
Cn |
Optional character string specifying the name of the censoring indicator variable. The variable specified in Cn should be a numeric vector taking values 0 or 1, with 1 indicating censored. |
outcomemodels |
A list of formulas or formula objects specifying the outcome models for Yn prior to adjustment by propensity score. The i'th entry of the list specifies the outcome model for the i-step counterfactuals. See Details. |
propensitymodel |
A formula or formula object specifying the propensity score model for An. |
censoringmodel |
A formula or formula object specifying the censoring model for Cn. |
type |
An integer from 1 to 4 specifying the SNMM type to fit. See Details. |
EfmVar |
Character string specifying the name of the effect modifying variable for types 2 or 4. |
cutoff |
An integer taking value from 1 up to T, where T is the maximum value of timevar.
Instructs the function to estimate causal effects based only on exposures up to |
... |
Additional arguments, currently not in use. |
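The requirements on timevar and Cn stated above can be checked before calling the function. The following is a minimal sketch in base R; the helper check_inputs and the toy data frame are hypothetical and are not part of gesttools.

```r
# Hypothetical pre-flight check (not part of gesttools).
check_inputs <- function(data, timevar, Cn = NA) {
  times <- data[[timevar]]
  # Time periods must be whole numbers starting at 1 (not 0)
  stopifnot(
    is.numeric(times),
    all(times == round(times)),
    min(times) == 1
  )
  # The censoring indicator, if supplied, must be numeric 0/1 (1 = censored)
  if (!is.na(Cn)) {
    cens <- data[[Cn]]
    stopifnot(is.numeric(cens), all(cens[!is.na(cens)] %in% c(0, 1)))
  }
  invisible(TRUE)
}

# Toy data for illustration only
toy <- data.frame(
  id   = rep(1:2, each = 3),
  time = rep(1:3, times = 2),
  C    = c(0, 0, 0, 0, 1, NA)
)
check_inputs(toy, timevar = "time", Cn = "C")
```

A data frame whose time variable starts at 0 would fail the check, which signals that the time variable needs to be shifted before fitting.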
Details
Suppose a series of time periods 1,\ldots,T+1 whereby a time-varying exposure and confounder (A_t and L_t) are measured over times t=1,\ldots,T and a time-varying outcome Y_s is measured over times s=2,\ldots,T+1. Define c=s-t as the step length, that is, the number of time periods separating an exposure measurement and the subsequent outcome measurement.
By using the transform t=s-c, gestMultiple estimates the causal parameters \psi of a SNMM of the form
E\{Y_s(\bar{a}_{s-c},0)-Y_s(\bar{a}_{s-c-1},0)|\bar{a}_{s-c-1},\bar{l}_{s-c}\}=\psi z_{sc}a_{s-c} \; \forall c=1,\ldots,T\; and\; \forall s>c
if Y is continuous or
\frac{E(Y_s(\bar{a}_{s-c},0)|\bar{a}_{s-c-1},\bar{l}_{s-c})}{E(Y_s(\bar{a}_{s-c-1},0)|\bar{a}_{s-c-1},\bar{l}_{s-c})}=exp(\psi z_{sc}a_{s-c}) \; \forall c=1,\ldots,T\; and \; \forall s>c
if Y is binary. The form of the SNMM is defined by the parameter z_{sc}, which can be controlled by the input type as follows:
type=1 sets z_{sc}=1. This implies that \psi is now the effect of exposure at any time t on all subsequent outcome periods.
type=2 sets z_{sc}=c(1,l_{s-c}) and adds effect modification by the variable named in EfmVar, which we denote l_t. Now \psi=c(\psi_0,\psi_1) where \psi_0 is the effect of exposure at any time t on all subsequent outcome periods, when l_{s-c}=0 at all times t, modified by \psi_1 for each unit increase in l_{s-c} at all times t. Note that effect modification is currently only supported for binary or continuous confounders.
type=3 posits a time-varying causal effect for each value of c, that is, the causal effect of the exposure on the outcome c time periods later. We set z_{sc} to a vector of zeros of length T with a 1 in the c=s-t'th position. Now \psi=c(\psi_{1},\ldots,\psi_{T}) where \psi_{c} is the effect of exposure on outcome c time periods later for all outcome periods s>c, that is, A_{s-c} on Y_s.
type=4 allows for a time-varying causal effect that can be modified by EfmVar, denoted l_t; that is, it allows for both time-varying effects and effect modification. It sets z_{sc} to a vector of zeros of length T with c(1,l_{s-c}) in the c=s-t'th position. Now \psi=(\underline{\psi_1},\ldots,\underline{\psi_T}) where \underline{\psi_c}=c(\psi_{0c},\psi_{1c}). Here \psi_{0c} is the effect of exposure on outcome c time periods later, given l_{s-c}=0 for all s>c, modified by \psi_{1c} for each unit increase in l_{s-c} for all s>c. Note that effect modification is currently only supported for binary or continuous confounders.
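To make the four choices of z_{sc} concrete, the sketch below builds the vector for a single step length in base R. The helper make_z and its argument names are hypothetical and are not part of gesttools. For type 4 it assumes that "the c'th position" means the c'th pair of a vector of length 2T, which is one possible reading of the description above.

```r
# Hypothetical helper (not part of gesttools): builds z_{sc} for one step length.
#   type    : SNMM type, 1 to 4
#   step    : step length c = s - t
#   n_times : T, the number of exposure times
#   l       : value of the effect modifier l_{s-c} (types 2 and 4 only)
make_z <- function(type, step, n_times, l = NA) {
  stopifnot(type %in% 1:4, step >= 1, step <= n_times)
  if (type == 1) {
    return(1)                       # one common effect
  }
  if (type == 2) {
    return(c(1, l))                 # common effect plus effect modification
  }
  if (type == 3) {
    z <- numeric(n_times)           # zeros, with a 1 in position c
    z[step] <- 1
    return(z)
  }
  # type 4: assumed layout of T pairs, with (1, l) in pair c
  z <- numeric(2 * n_times)
  z[(2 * step - 1):(2 * step)] <- c(1, l)
  z
}

make_z(type = 1, step = 2, n_times = 3)
make_z(type = 2, step = 2, n_times = 3, l = 0.5)
make_z(type = 3, step = 2, n_times = 3)
make_z(type = 4, step = 2, n_times = 3, l = 0.5)
```

Multiplying \psi by the resulting vector picks out the parameters that apply to an exposure measured c periods before the outcome.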
The data must be in long format, where we assume the convention that each row with time=t contains A_t, L_t and C_{t+1}, Y_{t+1}. That is, the censoring indicator for each row should indicate that an individual is censored AFTER time t, and the outcome is the first outcome that occurs AFTER A_t and L_t are measured. For example, data at time 1 should contain A_1, L_1, Y_{2}, and optionally C_2. If either A or Y is binary, it must be written as a numeric vector taking values 0 or 1. The same is true for any covariate that is used for effect modification.
The data must be rectangular with a row entry for every individual for each exposure time 1 up to T. Data rows after censoring should be empty apart from the ID and time variables. This can be done using the function FormatData.
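The layout described above can be illustrated with a small toy data set. The sketch below uses base R only; the helper blank_after_censoring is hypothetical and merely mimics the "empty rows after censoring" convention by hand, whereas in practice FormatData is the function intended for this step. Treating the outcome on the censored row itself as missing is an assumption of this example.

```r
# Toy long-format data: the row at time t holds A_t, L_t, Y_{t+1} and C_{t+1}.
# Individual 2 is censored after time 2 (C = 1 on the time = 2 row).
toy <- data.frame(
  id   = rep(1:2, each = 3),
  time = rep(1:3, times = 2),
  A    = c(1, 0, 1, 0, 1, 1),
  L    = c(0.2, 0.5, 0.1, 0.9, 0.4, 0.3),
  Y    = c(1.1, 0.7, 1.4, 0.3, NA, NA),
  C    = c(0, 0, 0, 0, 1, NA)
)

# Hypothetical helper (not part of gesttools): sets every row after the first
# censored row to NA, apart from the ID and time variables.
blank_after_censoring <- function(data, idvar, timevar, Cn) {
  keep <- c(idvar, timevar)
  for (i in unique(data[[idvar]])) {
    rows <- which(data[[idvar]] == i)
    rows <- rows[order(data[[timevar]][rows])]
    cens <- which(data[[Cn]][rows] == 1)
    if (length(cens) > 0 && cens[1] < length(rows)) {
      after <- rows[(cens[1] + 1):length(rows)]
      data[after, setdiff(names(data), keep)] <- NA
    }
  }
  data
}

out <- blank_after_censoring(toy, idvar = "id", timevar = "time", Cn = "C")

# Rectangular check: every individual has one row per exposure time 1..T
all(table(out$id) == max(out$time))
out
```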
The input outcomemodels should be a list with T elements (the number of exposure times), where element i describes the outcome model for up to the i-step counterfactual outcomes, that is, the model is fitted to all counterfactuals up to Y_{s-i}.
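As a small illustration of this requirement, the sketch below builds an outcomemodels list for T = 3 and checks its length. The formula strings mirror those used in the Examples section of this page; the value of n_times and the variable names Y, A, L, U and Lag1A are assumptions tied to that example data.

```r
n_times <- 3  # T, the number of exposure times (assumed for illustration)

# One outcome model per step length; element i is used for the i-step counterfactuals
outcomemodels <- list(
  "Y~A+L+U+Lag1A",  # element 1
  "Y~A+L+U+Lag1A",  # element 2
  "Y~A+L+U"         # element 3
)

# The list must have exactly T elements
stopifnot(length(outcomemodels) == n_times)

# Each entry should parse as a formula with the outcome on the left-hand side
parsed <- lapply(outcomemodels, as.formula)
vapply(parsed, function(f) all.vars(f)[1], character(1))
```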
Value
List of the fitted causal parameters of the posited SNMM. These are labeled as follows for each SNMM type, where An is set to the name of the exposure variable, i is the current value of c, and EfmVar is the effect modifying variable.
type=1 |
An: The effect of exposure at any time t on outcome at all subsequent times. |
type=2 |
An: The effect of exposure at any time t on outcome at all subsequent times, when EfmVar is set to zero. |
type=3 |
c=i.An: The effect of exposure at any time t on outcome |
type=4 |
c=i.An: The effect of exposure at any time t on outcome |
The function also returns a summary of the propensity scores and censoring scores via PropensitySummary and CensoringSummary, along with Data, holding the original dataset with the propensity and censoring scores as a tibble dataset.
References
Vansteelandt, S., & Sjolander, A. (2016). Revisiting g-estimation of the Effect of a Time-varying Exposure Subject to Time-varying Confounding, Epidemiologic Methods, 5(1), 37-56. <doi:10.1515/em-2015-0005>.
Dukes, O., & Vansteelandt, S. (2018). A Note on g-Estimation of Causal Risk Ratios, American Journal of Epidemiology, 187(5), 1079-1084. <doi:10.1093/aje/kwx347>.
Examples
library(gesttools)

# Example without censoring
datas <- dataexamples(n = 1000, seed = 123, Censoring = FALSE)
data <- datas$datagestmult
# Format the data and generate lagged history variables
# (for example Lag1A, used in the models below)
data <- FormatData(
  data = data, idvar = "id", timevar = "time", An = "A",
  varying = c("Y", "A", "L"), GenerateHistory = TRUE, GenerateHistoryMax = 1
)
idvar <- "id"
timevar <- "time"
Yn <- "Y"
An <- "A"
Cn <- NA
# One outcome model per step length
outcomemodels <- list("Y~A+L+U+Lag1A", "Y~A+L+U+Lag1A", "Y~A+L+U")
propensitymodel <- c("A~L+U+as.factor(time)+Lag1A")
censoringmodel <- NULL
EfmVar <- NA
# type = 1: a single effect of exposure on all subsequent outcomes
gestMultiple(data, idvar, timevar, Yn, An, Cn, outcomemodels, propensitymodel,
  censoringmodel = censoringmodel, type = 1, EfmVar = EfmVar,
  cutoff = NA
)
# Example with censoring
# (reuses idvar, timevar, Yn, An and propensitymodel defined above)
datas <- dataexamples(n = 1000, seed = 123, Censoring = TRUE)
data <- datas$datagestmult
data <- FormatData(
  data = data, idvar = "id", timevar = "time", An = "A", Cn = "C",
  varying = c("Y", "A", "L"), GenerateHistory = TRUE, GenerateHistoryMax = 1
)
Cn <- "C"
EfmVar <- "L"
# Outcome models include the A:L interaction for effect modification by L
outcomemodels <- list("Y~A+L+U+A:L+Lag1A", "Y~A+L+U+A:L+Lag1A", "Y~A+L+U+A:L")
censoringmodel <- c("C~L+U+as.factor(time)")
# type = 2: effect modification by EfmVar, with cutoff = 2
gestMultiple(data, idvar, timevar, Yn, An, Cn, outcomemodels, propensitymodel,
  censoringmodel = censoringmodel, type = 2, EfmVar = EfmVar,
  cutoff = 2
)