R: Construction of distance covariance optimal weights

independence_weights {independenceWeights}

R Documentation

Construction of distance covariance optimal weights

Description

Constructs independence-inducing weights (distance covariance optimal weights, DCOWs) for estimating causal quantities with continuous-valued treatments.

Usage

independence_weights(
  A,
  X,
  lambda = 0,
  decorrelate_moments = FALSE,
  preserve_means = FALSE,
  dimension_adj = TRUE
)

Arguments

`A`	vector indicating the value of the treatment or exposure variable. Should be a numeric vector.
`X`	matrix of covariates with number of rows equal to the length of `A`. Each column is a pre-treatment covariate to be balanced across levels of the treatment.
`lambda`	tuning parameter for the ridge penalty on the sum of squares of the weights. Defaults to `0`.
`decorrelate_moments`	logical scalar. Whether or not to add constraints that result in exact decorrelation of weighted first order moments of `X` and `A`. Defaults to `FALSE`.
`preserve_means`	logical scalar. Whether or not to add constraints that result in exact preservation of weighted first order moments of `X` and `A`. Defaults to `FALSE`.
`dimension_adj`	logical scalar. Whether or not to add an adjustment to the energy distance terms that accounts for the dimensionality of `X`. Defaults to `TRUE`.
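
As an illustration of what `decorrelate_moments` targets, the sketch below computes the weighted correlation between `A` and each column of `X` for a given weight vector. The helper `weighted_cor()` and the simulated data are hypothetical and are not part of this package; with weights satisfying the decorrelation constraints, these correlations would be expected to be zero up to solver tolerance.

```r
# Hypothetical helper (not part of independenceWeights): weighted Pearson
# correlation between the treatment A and each column of X.
weighted_cor <- function(A, X, w) {
  w <- w / sum(w)                       # normalize weights to sum to 1
  a_c <- A - sum(w * A)                 # center A at its weighted mean
  X_c <- sweep(X, 2, colSums(w * X))    # center each column of X likewise
  cov_ax <- colSums(w * a_c * X_c)      # weighted covariances with A
  cov_ax / sqrt(sum(w * a_c^2) * colSums(w * X_c^2))
}

# Simulated data for illustration only
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, c("Z1", "Z2")))
A <- X[, 1] + rnorm(n)                  # treatment confounded by Z1

# With uniform weights this reduces to the ordinary sample correlation
unif_cor <- weighted_cor(A, X, rep(1, n))
```

Passing the `weights` element of a fitted object in place of `rep(1, n)` gives the corresponding weighted diagnostic.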

Value

An object of class "independence_weights" with elements:

`weights`	A vector of length `nrow(X)` containing the estimated sample weights
`A`	Treatment vector
`opt`	The optimization object returned by `osqp::solve_osqp()`
`objective`	The value of the objective function at its optimal value. This is the weighted dependence statistic plus any ridge penalty on the weights.
`D_unweighted`	The value of the weighted dependence distance with all weights equal to 1 (i.e., unweighted)
`D_w`	The value of the weighted dependence distance of Huling et al. (2021) using the optimal estimated weights. This is the weighted dependence statistic without the ridge penalty on the weights.
`distcov_unweighted`	The unweighted distance covariance term. This is the standard distance covariance of Szekely et al. (2007). This term is always equal to `D_unweighted`.
`distcov_weighted`	The weighted distance covariance term. This term itself does not directly measure weighted dependence but is a critical component of it.
`energy_A`	The weighted energy distance between `A` and its weighted version
`energy_X`	The weighted energy distance between `X` and its weighted version
`ess`	The estimated effective sample size of the weights using Kish's effective sample size formula.
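
For orientation, the standard (unweighted) sample distance covariance of Szekely et al. (2007) can be sketched in base R as below. The helpers `double_center()` and `dcov2()` are hypothetical names, not functions of this package. This sketch returns the squared V-statistic form, and the scaling used internally for `distcov_unweighted` may differ.

```r
# Hypothetical helpers (not part of independenceWeights).

# Double-center a distance matrix: subtract row and column means,
# then add back the grand mean.
double_center <- function(D) {
  sweep(sweep(D, 1, rowMeans(D)), 2, colMeans(D)) + mean(D)
}

# Squared sample distance covariance (V-statistic form):
# (1 / n^2) * sum over k, l of A_kl * B_kl, where A_kl and B_kl are the
# double-centered pairwise Euclidean distances of A and X.
dcov2 <- function(A, X) {
  DA <- double_center(as.matrix(dist(A)))
  DX <- double_center(as.matrix(dist(X)))
  mean(DA * DX)
}

# Simulated data for illustration only
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 2), n, 2)
A_dep <- X[, 1] + rnorm(n)   # depends on the first covariate
A_ind <- rnorm(n)            # independent of X

dcov_dep <- dcov2(A_dep, X)
dcov_ind <- dcov2(A_ind, X)
```

Larger values indicate stronger dependence between treatment and covariates, so `dcov_dep` should exceed `dcov_ind` here.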

The elements of the returned object can be interpreted as follows:

`weights`	the estimated weights, the distance covariance optimal weights (DCOWs)
`A`	the treatment vector
`opt`	the object returned by whatever optimization routine was used
`objective`	the value of the optimized objective function
`distcov_unweighted`	the unweighted distance covariance between treatment and covariates
`distcov_weighted`	the weighted distance covariance between treatment and covariates
`energy_A`	the (energy) distance between the treatment distribution and the weighted treatment distribution. Smaller values mean the marginal distribution of the treatment is preserved after weighting
`energy_X`	the (energy) distance between the covariate distribution and the weighted covariate distribution. Smaller values mean the marginal distribution of the covariates is preserved after weighting
`ess`	the effective sample size after weighting. Kish's approximation is used
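
Two of these diagnostics can be sketched in base R. The helpers `kish_ess()` and `energy_to_unweighted()` are hypothetical names, not functions of this package. The first is Kish's formula, `sum(w)^2 / sum(w^2)`. The second is the usual energy distance between the empirical distribution of `A` and its weighted version, and the package's own scaling of `energy_A` may differ.

```r
# Hypothetical helpers (not part of independenceWeights).

# Kish's effective sample size: (sum of weights)^2 / (sum of squared weights)
kish_ess <- function(w) {
  sum(w)^2 / sum(w^2)
}

# Energy distance between the empirical distribution of A (mass 1/n on each
# point) and its weighted version (mass w_i / sum(w) on each point).
energy_to_unweighted <- function(A, w) {
  D <- as.matrix(dist(A))     # pairwise Euclidean distances
  n <- nrow(D)
  p <- w / sum(w)             # weighted probabilities
  u <- rep(1 / n, n)          # uniform probabilities
  2 * sum(outer(p, u) * D) - sum(outer(p, p) * D) - sum(outer(u, u) * D)
}

# Simulated data for illustration only
set.seed(2)
A <- rnorm(50)
w <- runif(50)

ess_uniform <- kish_ess(rep(1, 50))               # equals the sample size
ess_w <- kish_ess(w)                              # smaller for unequal weights
energy_uniform <- energy_to_unweighted(A, rep(1, 50))
energy_w <- energy_to_unweighted(A, w)
```

Uniform weights give an effective sample size equal to `n` and an energy distance of zero; unequal weights reduce the former and increase the latter.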

References

Szekely, G. J., Rizzo, M. L., & Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. Annals of Statistics, 35(6), 2769-2794. doi: 10.1214/009053607000000505

Huling, J. D., Greifer, N., & Chen, G. (2021). Independence weights for causal inference with continuous exposures. arXiv preprint arXiv:2107.07086. https://arxiv.org/abs/2107.07086

Examples


simdat <- simulate_confounded_data(seed = 999, nobs = 500)

y <- simdat$data$Y
A <- simdat$data$A
X <- as.matrix(simdat$data[c("Z1", "Z2", "Z3", "Z4", "Z5")])

dcows <- independence_weights(A, X)

print(dcows)

# distribution of response:
quantile(y)

## create grid
trt_vec <- seq(min(simdat$data$A), 50, length.out=500)

## estimate ADRF
adrf_hat <- weighted_kernel_est(A, y, dcows$weights, trt_vec)$est

## estimate naively without weights
adrf_hat_unwtd <- weighted_kernel_est(A, y, rep(1, length(y)), trt_vec)$est

ylims <- range(c(simdat$data$Y, simdat$true_adrf(trt_vec)))
plot(x = simdat$data$A, y = simdat$data$Y, ylim = ylims, xlim = c(0,50))
## true ADRF
lines(x = trt_vec, y = simdat$true_adrf(trt_vec), col = "blue", lwd=2)
## estimated ADRF
lines(x = trt_vec, y = adrf_hat, col = "red", lwd=2)
## naive estimate
lines(x = trt_vec, y = adrf_hat_unwtd, col = "green", lwd=2)

[Package independenceWeights version 0.0.1 Index]