lmtp_sdr {lmtp}R Documentation

LMTP Sequential Doubly Robust Estimator

Description

Sequentially doubly robust estimator for the effects of traditional causal effects and modified treatment policies for both point treatment and longitudinal data with binary, continuous, or time-to-event outcomes. Supports binary, categorical, and continuous exposures.

Usage

lmtp_sdr(
  data,
  trt,
  outcome,
  baseline = NULL,
  time_vary = NULL,
  cens = NULL,
  shift = NULL,
  shifted = NULL,
  k = Inf,
  mtp = FALSE,
  outcome_type = c("binomial", "continuous", "survival"),
  id = NULL,
  bounds = NULL,
  learners_outcome = "SL.glm",
  learners_trt = "SL.glm",
  folds = 10,
  weights = NULL,
  control = lmtp_control()
)

Arguments

data

[data.frame]
A data.frame in wide format containing all necessary variables for the estimation problem. Must not be a data.table.

trt

[character] or [list]
A vector containing the column names of treatment variables ordered by time. Or, a list of vectors, the same length as the number of time points of observation. Vectors should contain column names for the treatment variables at each time point. The list should be ordered following the time ordering of the model.

outcome

[character]
The column name of the outcome variable. In the case of time-to-event analysis, a vector containing the columns names of intermediate outcome variables and the final outcome variable ordered by time. Only numeric values are allowed. If the outcome type is binary, data should be coded as 0 and 1.

baseline

[character]
An optional vector containing the column names of baseline covariates to be included for adjustment at every time point.

time_vary

[list]
A list the same length as the number of time points of observation with the column names for new time-varying covariates introduced at each time point. The list should be ordered following the time ordering of the model.

cens

[character]
An optional vector of column names of censoring indicators the same length as the number of time points of observation. If missingness in the outcome is present or if time-to-event outcome, must be provided.

shift

[closure]
A two argument function that specifies how treatment variables should be shifted. See examples for how to specify shift functions for continuous, binary, and categorical exposures.

shifted

[data.frame]
An optional data frame, the same as in data, but modified according to the treatment policy of interest. If specified, shift is ignored.

k

[integer(1)]
An integer specifying how previous time points should be used for estimation at the given time point. Default is Inf, all time points.

mtp

[logical(1)]
Is the intervention of interest a modified treatment policy? Default is FALSE. If treatment variables are continuous this should be TRUE.

outcome_type

[character(1)]
Outcome variable type (i.e., continuous, binomial, survival).

id

[character(1)]
An optional column name containing cluster level identifiers.

bounds

[numeric(2)]
An optional, ordered vector of the bounds for a continuous outcomes. If NULL, the bounds will be taken as the minimum and maximum of the observed data. Should be left as NULL if the outcome type is binary.

learners_outcome

[character]
A vector of SuperLearner algorithms for estimation of the outcome regression. Default is "SL.glm", a main effects GLM.

learners_trt

[character]
A vector of SuperLearner algorithms for estimation of the exposure mechanism. Default is "SL.glm", a main effects GLM. Only include candidate learners capable of binary classification.

folds

[integer(1)]
The number of folds to be used for cross-fitting.

weights

[numeric(nrow(data))]
An optional vector containing sampling weights.

control

[list()]
Output of lmtp_control().

Details

Should mtp = TRUE?

A modified treatment policy (MTP) is an intervention that depends on the natural value of the exposure (the value that the treatment would have taken under no intervention). This differs from other causal effects, such as the average treatment effect (ATE), where an exposure would be increased (or decreased) deterministically. If your intervention of interest adds, subtracts, or multiplies the observed treatment values by some amount, use mtp = TRUE.

Value

A list of class lmtp containing the following components:

estimator

The estimator used, in this case "SDR".

theta

The estimated population LMTP effect.

standard_error

The estimated standard error of the LMTP effect.

low

Lower bound of the 95% confidence interval of the LMTP effect.

high

Upper bound of the 95% confidence interval of the LMTP effect.

eif

The estimated, un-centered, influence function of the estimate.

shift

The shift function specifying the treatment policy of interest.

outcome_reg

An n x Tau + 1 matrix of outcome regression predictions. The mean of the first column is used for calculating theta.

density_ratios

An n x Tau matrix of the estimated, non-cumulative, density ratios.

fits_m

A list the same length as folds, containing the fits at each time-point for each fold for the outcome regression.

fits_r

A list the same length as folds, containing the fits at each time-point for each fold of density ratio estimation.

outcome_type

The outcome variable type.

Examples


  set.seed(56)
  n <- 1000
  W <- rnorm(n, 10, 5)
  A <- 23 + 5*W + rnorm(n)
  Y <- 7.2*A + 3*W + rnorm(n)
  ex1_dat <- data.frame(W, A, Y)

  # Example 1.1
  # Point treatment, continuous exposure, continuous outcome, no loss-to-follow-up
  # Interested in the effect of a modified treatment policy where A is decreased by 15
  # units only among observations whose observed A was above 80.
  # The true value under this intervention is about 513.
  policy <- function(data, x) {
    (data[[x]] > 80)*(data[[x]] - 15) + (data[[x]] <= 80)*data[[x]]
  }

  lmtp_sdr(ex1_dat, "A", "Y", "W", shift = policy,
           outcome_type = "continuous", folds = 2, mtp = TRUE)

  # Example 2.1
  # Longitudinal setting, time-varying continuous exposure bounded by 0,
  # time-varying covariates, and a binary outcome with no loss-to-follow-up.
  # Interested in the effect of a treatment policy where exposure decreases by
  # one unit at every time point if an observations observed exposure is greater
  # than or equal to 2. The true value under this intervention is about 0.305.
  head(sim_t4)

  A <- c("A_1", "A_2", "A_3", "A_4")
  L <- list(c("L_1"), c("L_2"), c("L_3"), c("L_4"))

  policy <- function(data, trt) {
    a <- data[[trt]]
    (a - 1) * (a - 1 >= 1) + a * (a - 1 < 1)
  }

  # BONUS: progressr progress bars!
  progressr::handlers(global = TRUE)

  lmtp_sdr(sim_t4, A, "Y", time_vary = L, shift = policy,
           folds = 2, mtp = TRUE)

  # Example 2.2
  # The previous example assumed that the outcome (as well as the treatment variables)
  # were directly affected by all other nodes in the past. In certain situations,
  # domain specific knowledge may suggest otherwise.
  # This can be controlled using the k argument.
  lmtp_sdr(sim_t4, A, "Y", time_vary = L, shift = policy,
           k = 0, folds = 2, mtp = TRUE)

  # Example 2.3
  # Using the same data as examples 2.1 and 2.2.
  # Now estimating the effect of a dynamic modified treatment policy.
  # creating a dynamic mtp that applies the shift function
  # but also depends on history and the current time
  policy <- function(data, trt) {
    mtp <- function(data, trt) {
      (data[[trt]] - 1) * (data[[trt]] - 1 >= 1) + data[[trt]] * (data[[trt]] - 1 < 1)
    }

    # if its the first time point, follow the same mtp as before
    if (trt == "A_1") return(mtp(data, trt))

    # otherwise check if the time varying covariate equals 1
    ifelse(
      data[[sub("A", "L", trt)]] == 1,
      mtp(data, trt), # if yes continue with the policy
      data[[trt]]     # otherwise do nothing
    )
  }

  lmtp_sdr(sim_t4, A, "Y", time_vary = L, mtp = TRUE,
           k = 0, shift = policy, folds = 2)

  # Example 2.4
  # Using the same data as examples 2.1, 2.2, and 2.3, but now treating the exposure
  # as an ordered categorical variable. To account for the exposure being a
  # factor we just need to modify the shift function (and the original data)
  # so as to respect this.
  tmp <- sim_t4
  for (i in A) {
    tmp[[i]] <- factor(tmp[[i]], levels = 0:5, ordered = TRUE)
  }

  policy <- function(data, trt) {
    out <- list()
    a <- data[[trt]]
    for (i in 1:length(a)) {
      if (as.character(a[i]) %in% c("0", "1")) {
        out[[i]] <- as.character(a[i])
      } else {
        out[[i]] <- as.numeric(as.character(a[i])) - 1
      }
    }
    factor(unlist(out), levels = 0:5, ordered = TRUE)
  }

  lmtp_sdr(tmp, A, "Y", time_vary = L, shift = policy,
           k = 0, folds = 2, mtp = TRUE)

  # Example 3.1
  # Longitudinal setting, time-varying binary treatment, time-varying covariates
  # and baseline covariates with no loss-to-follow-up. Interested in a "traditional"
  # causal effect where treatment is set to 1 at all time points for all observations.
  if (require("twang")) {
    data("iptwExWide", package = "twang")

    A <- paste0("tx", 1:3)
    W <- c("gender", "age")
    L <- list(c("use0"), c("use1"), c("use2"))

    lmtp_sdr(
      iptwExWide, A, "outcome", baseline = W, time_vary = L,
      shift = static_binary_on, outcome_type = "continuous",
      mtp = FALSE, folds = 2
    )
  }

  # Example 4.1
  # Longitudinal setting, time-varying continuous treatment, time-varying covariates,
  # binary outcome with right censoring. Interested in the mean population outcome under
  # the observed exposures in a hypothetical population with no loss-to-follow-up.
  head(sim_cens)

  A <- c("A1", "A2")
  L <- list(c("L1"), c("L2"))
  C <- c("C1", "C2")
  Y <- "Y"

  lmtp_sdr(sim_cens, A, Y, time_vary = L, cens = C, shift = NULL, folds = 2)

  # Example 5.1
  # Time-to-event analysis with a binary time-invariant exposure. Interested in
  # the effect of treatment being given to all observations on the cumulative
  # incidence of the outcome.
  # For a survival problem, the outcome argument now takes a vector of outcomes
  # if an observation experiences the event prior to the end of follow-up, all future
  # outcome nodes should be set to 1 (i.e., last observation carried forward).
  A <- "trt"
  Y <- paste0("Y.", 1:6)
  C <- paste0("C.", 0:5)
  W <- c("W1", "W2")

  lmtp_sdr(sim_point_surv, A, Y, W, cens = C, folds = 2,
           shift = static_binary_on, outcome_type = "survival")

  # Example 6.1
  # Intervening on multiple exposures simultaneously. Interested in the effect of
  # a modified treatment policy where variable D1 is decreased by 0.1 units and
  # variable D2 is decreased by 0.5 units simultaneously.
  A <- list(c("D1", "D2"))
  W <- paste0("C", 1:3)
  Y <- "Y"

  d <- function(data, a) {
    out = list(
      data[[a[1]]] - 0.1,
      data[[a[2]]] - 0.5
    )
    setNames(out, a)
  }

  lmtp_sdr(multivariate_data, A, Y, W, shift = d,
           outcome_type = "continuous", folds = 1, mtp = TRUE)


[Package lmtp version 1.4.0 Index]