CForBenefit {EpiForsk}    R Documentation

c-for-benefit

Description

Calculates the c-for-benefit, as proposed by D. van Klaveren et al. (2018), by matching treated and control patients on patient characteristics or on the estimated treatment effect.

Usage

CForBenefit(
  forest,
  match = c("covariates", "CATE"),
  match_method = "nearest",
  match_distance = "mahalanobis",
  tau_hat_method = c("risk_diff", "tau_avg"),
  CI = c("simple", "bootstrap", "none"),
  level = 0.95,
  n_bootstraps = 999L,
  time_limit = Inf,
  time_limit_CI = Inf,
  verbose = TRUE,
  Y = NULL,
  W = NULL,
  X = NULL,
  p_0 = NULL,
  p_1 = NULL,
  tau_hat = NULL,
  ...
)

Arguments

forest

An object of class causal_forest, as returned by causal_forest().

match

character, "covariates" to match on covariates or "CATE" to match on the estimated conditional average treatment effect (CATE).

match_method

see matchit.

match_distance

see matchit.

tau_hat_method

character, "risk_diff" to calculate the expected treatment effect in a matched pair as the risk under treatment for the treated subject minus the risk under control for the untreated subject, or "tau_avg" to calculate it as the average of the estimated treatment effects of the two matched subjects.

CI

character, "none" for no confidence interval, "simple" to use a normal approximation, and "bootstrap" to use the bootstrap.

level

numeric, confidence level of the confidence interval.

n_bootstraps

numeric, number of bootstraps to use for the bootstrap confidence interval computation.

time_limit

numeric, maximum time allowed to compute the c-for-benefit. If the limit is reached, execution stops.

time_limit_CI

numeric, maximum time allowed to compute the bootstrap confidence interval. If the limit is reached, the user is asked whether execution should continue or stop.

verbose

logical, TRUE to display a progress bar, FALSE to suppress it.

Y

a vector of outcomes. If provided, replaces forest$Y.orig.

W

a vector of treatment assignments: 1 for active treatment, 0 for control. If provided, replaces forest$W.orig.

X

a matrix of patient characteristics. If provided, replaces forest$X.orig.

p_0

a vector of outcome probabilities under control.

p_1

a vector of outcome probabilities under active treatment.

tau_hat

a vector of individualized treatment effect predictions. If provided, replaces forest$predictions.

...

additional arguments for matchit.

Details

The c-for-benefit statistic is inspired by the c-statistic used with prediction models to measure discrimination. The c-statistic takes all pairs of observations discordant on the outcome and calculates the proportion of these in which the subject with the higher predicted probability was the one who experienced the outcome. To extend this to treatment effects, van Klaveren et al. suggest matching a treated subject to a control subject on the predicted treatment effect (or alternatively on the covariates) and defining the observed effect as the difference between the outcomes of the treated subject and the control subject. The c-for-benefit statistic is then defined as the proportion of pairs of matched pairs with unequal observed effect in which the matched pair with the greater observed effect also has the greater expected treatment effect.
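
The concordance step described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by CForBenefit(): the helper c_for_benefit_pairs() and its arguments are hypothetical, the matching is assumed to have been done already, and counting tied predictions as one half is an assumption borrowed from the usual c-statistic convention.

```r
# Hypothetical helper (not part of EpiForsk): concordance over matched pairs.
# obs_effect:  observed effect per matched pair, i.e. outcome of the treated
#              subject minus outcome of the matched control (-1, 0 or 1)
# pred_effect: expected treatment effect per matched pair
c_for_benefit_pairs <- function(obs_effect, pred_effect) {
  n <- length(obs_effect)
  stopifnot(length(pred_effect) == n, n >= 2)
  idx <- utils::combn(n, 2)                      # every pair of matched pairs
  d_obs <- obs_effect[idx[1, ]] - obs_effect[idx[2, ]]
  d_pred <- pred_effect[idx[1, ]] - pred_effect[idx[2, ]]
  keep <- d_obs != 0                             # unequal observed effect only
  concordant <- sign(d_obs[keep]) == sign(d_pred[keep])
  tied <- d_pred[keep] == 0                      # assumption: ties count 1/2
  (sum(concordant) + 0.5 * sum(tied)) / sum(keep)
}

# Made-up outcomes and predictions for four matched pairs
y_treated <- c(1, 0, 1, 0)
y_control <- c(0, 0, 1, 1)
obs_effect <- y_treated - y_control
pred_effect <- c(0.3, 0.1, 0.2, 0.15)
cfb <- c_for_benefit_pairs(obs_effect, pred_effect)
```

Comparing all pairs of matched pairs is quadratic in the number of matched pairs, which is presumably one reason a time limit is useful for larger data sets.
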
When calculating the expected treatment effect, van Klaveren et al. use the average CATE of the two matched subjects in a pair (tau_hat_method = "tau_avg"). However, this does not match the observed effect used unless the baseline risks are equal. The observed effect is the difference between the observed outcome for the subject receiving treatment and the observed outcome for the subject receiving control. Their outcomes are governed by the exposed risk and the baseline risk, respectively. The baseline risks are ideally equal when matching on covariates, although non-exact matching combined with instability of the forest estimates can lead to substantially different baseline risks. When matching on CATE, we should not expect the baseline risks to be equal. Instead, we can more closely match the observed treatment effect by using the difference between the exposed risk of the subject receiving treatment and the baseline risk of the subject receiving control (tau_hat_method = "risk_diff").
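
The difference between the two definitions of the expected treatment effect can be illustrated with a small numeric sketch. All risks below are made-up numbers and the variable names are hypothetical; the point is only that the two definitions agree for a pair with equal baseline risks and equal CATE, and can disagree, even in sign, when the baseline risks differ.

```r
# Made-up risks for two matched pairs; element i describes pair i,
# consisting of one treated subject and its matched control.
p0_treated <- c(0.20, 0.30)  # baseline risk of the treated subject
p1_treated <- c(0.35, 0.40)  # exposed risk of the treated subject
p0_control <- c(0.20, 0.50)  # baseline risk of the control subject
p1_control <- c(0.35, 0.70)  # exposed risk of the control subject

# CATE of each subject: exposed risk minus baseline risk
tau_treated <- p1_treated - p0_treated
tau_control <- p1_control - p0_control

# "tau_avg": average CATE of the two subjects in the pair
tau_avg <- (tau_treated + tau_control) / 2

# "risk_diff": exposed risk of the treated subject minus
# baseline risk of the control subject
risk_diff <- p1_treated - p0_control

print(rbind(tau_avg = tau_avg, risk_diff = risk_diff))
```

In the first pair both definitions give the same value. In the second pair the control subject has a much higher baseline risk, so the risk difference is negative even though both subjects have a positive CATE.
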

Value

a list with the following components:

Author(s)

KIJA

Examples


# simulate data where treatment changes the event risk only when X[, 1] > 0
n <- 800
p <- 3
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.5)
event_prob <- 1 / (1 + exp(2 * (pmax(2 * X[, 1], 0) * W - X[, 2])))
Y <- rbinom(n, 1, event_prob)
# fit a causal forest and compute the c-for-benefit with a bootstrap CI
cf <- grf::causal_forest(X, Y, W)
CB_out <- CForBenefit(
  forest = cf, CI = "bootstrap", n_bootstraps = 20L, verbose = TRUE,
  match_method = "nearest", match_distance = "mahalanobis"
)



[Package EpiForsk version 0.1.1 Index]