R: Multivariate Generalized Propensity Score

mvGPS {mvGPS}

R Documentation

Multivariate Generalized Propensity Score

Description

Estimate propensity scores for multivariate continuous exposure by assuming joint normal conditional densities. Simultaneously estimate stabilized inverse probability of treatment weights (IPTW) using joint normal density for marginal distribution of exposure.

Usage

mvGPS(D, C, common = FALSE, trim_w = FALSE, trim_quantile = 0.99)

Arguments

`D`	numeric matrix of dimension `n` by `m` designating values of the exposures
`C`	either a list of numeric matrices of length `m` of dimension `n` by `p_j` designating values of the confounders for each exposure value or if `common` is TRUE a single matrix of of dimension `n` by `p` that represents common confounders for all exposures.
`common`	logical indicator for whether C is a single matrix of common confounders for all exposures. default is FALSE meaning C must be specified as list of confounders of length `m`.
`trim_w`	logical indicator for whether to trim weights. default is FALSE
`trim_quantile`	numeric scalar used to specify the upper quantile to trim weights if applicable. default is 0.99

Details

Generalized propensity scores (GPS) were proposed by Hirano and Imbens (2004) and Imai and Van Dyk (2004) to extend propensity scores to handle continuous exposures. The GPS is constructed using the conditional density of the exposure given a set of confounders. These original methods and the subsequent literature have focused on the case of a single continuous exposure where the GPS could be estimated using normal densities, kernel smoothing (Kennedy et al. 2017), gradient boosting (Zhu et al. 2015), and empirical likelihoods (Fong et al. 2018). In this package we provide an extension to this literature to allow for multivariate continuous exposures to be estimated.

Notation

Assume that we have a set of continuous exposures, D, of length m, i.e., \mathbf{D}=D_{1}, \dots, D_{m} collected on n units. Further, we assume that there exists a set of confounders \mathbf{C}_{1},\dots,\mathbf{C}_{m} for each exposure of length p_{j} for j=1,\dots,m. The confounders are related to both the exposures and the outcome of interest such. Note that the confounders need not be identical for all exposures.

In our function we therefore expect that the argument D is a numeric matrix of dimension n\times m, and that C is a list of length m where each element is a matrix of dimension n\times p_{j}. For the case where we assume that all exposures have common confounders we set common=TRUE and C must be a matrix of dimension n\times p.

mvGPS

We define the multivariate generalized propensity score, mvGPS, as

mvGPS=f_{\mathbf{D}\mid \mathbf{C}_{1},\dots,\mathbf{C}_{m}}

where f represents a joint multivariate conditional density function. For our current development we specify f as multivariate normal, i.e.,

\mathbf{D}\mid \mathbf{C}_{1},\dots,\mathbf{C}_{m}\sim N_{m}(\boldsymbol{\mu}, \Sigma).

Factorizing this joint density we can rewrite the mvGPS as the product of univariate conditional densities, i.e.,

mvGPS=f_{D_{m}\mid \mathbf{C}_{m}, D_{m-1},\dots,D_{1}}\times\cdots\times f_{D_{1}\mid\mathbf{C}_{1}}.

We use this factorized version in our implementation, with parameters for each distribution estimated through least squares.

Constructing Weights

Following Robins et al. (2000), we use the mvGPS to construct stabilized inverse probability of treatment (IPTW) weights. These have been shown to balance confounders and return unbiased estimated of the dose-response. Weights are constructed as

w=\frac{f_{\mathbf{D}}}{f_{\mathbf{D}\mid \mathbf{C}_{1},\dots,\mathbf{C}_{m}}},

where the marginal density f_{\mathbf{D}} of the exposures is assumed to be multivariate normal as well.

Writing the weights using completely factorized densities we have

w=\frac{f_{D_{m}\mid D_{m-1},\dots, D_{1}}\times\cdots\times f_{D_{1}}}{f_{D_{m}\mid \mathbf{C}_{m}, D_{m-1},\dots,D_{1}}\times\cdots\times f_{D_{1}\mid\mathbf{C}_{1}}}.

Trimming

Often when using weights based on the propensity score, practitioners are concerned about the effect of extreme weights. It has been shown that an effective way to protect extreme weights is to trim them at a particular percentile (Crump et al. 2009; Lee et al. 2011). We allow users to specify whether trimmed weights should be returned and at which threshold. To trim weights set trim_w=TRUE and specify the desired percentile as trim_quantile=q. Note that trimming is applied at both the upper and lower percentile thresholds, i.e.,

w^{*}=w 1_{\{Q(w, 1-q)\le w \le Q(w, q)\}}+Q(w, 1-q) 1_{\{w < Q(w, 1-q)\}} + Q(w, q) 1_{\{w > Q(w, q)\}}

Value

list of score and wts, where score is the mvGPS score values and wts are the corresponding stabilized inverse probability of treatment weights

References

Crump RK, Hotz VJ, Imbens GW, Mitnik OA (2009). “Dealing with limited overlap in estimation of average treatment effects.” Biometrika, 96(1), 187-199.

Fong C, Hazlett C, Imai K (2018). “Covariate balancing propensity score for a continuous treatment: application to the efficacy of political advertisements.” Annals of Applied Statistics, In-Press.

Hirano K, Imbens GW (2004). “The propensity score with continuous treatments.” In Gelman A, Meng X (eds.), Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives, 73-84.

Imai K, Van Dyk DA (2004). “Causal inference with general treatment regimes: generalizing the propensity score.” Journal of the American Statistical Association, 99(467), 854-866.

Kennedy EH, Ma Z, McHugh MD, Small DS (2017). “Non-parametric methods for doubly robust estimation of continuous treatment effects.” Journal of the Royal Statistical Society: Series B, 79(4), 1229-1245.

Lee BK, Lessler J, Stuart EA (2011). “Weight trimming and propensity score weighting.” PloS One, 6(3).

Robins JM, Hernan MA, Brumback B (2000). “Marginal structural models and causal inference in epidemiology.” Epidemiology, 11(5), 550-560.

Zhu Y, Coffman DL, Ghosh D (2015). “A boosting algorithm for estimating generalized propensity scores with continuous treatments.” Journal of Causal Inference, 3(1), 25-40.

Examples

#generating confounded bivariate exposure
sim_dt <- gen_D(method="u", n=200, rho_cond=0.2, s_d1_cond=2, s_d2_cond=2, k=3, 
C_mu=rep(0, 3), C_cov=0.1, C_var=1, d1_beta=c(0.5, 1, 0), d2_beta=c(0, 0.3, 0.75), seed=06112020)
D <- sim_dt$D
C <- sim_dt$C

#generating weights and mvGPS
out_mvGPS <- mvGPS(D=D, C=list(C[, 1:2], C[, 2:3]))

# can apply trimming with default 99th percentile
out_mvGPS_trim <- mvGPS(D=D, C=list(C[, 1:2], C[, 2:3]), trim_w=TRUE)

#or assume all exposures have equivalent confounders
out_mvGPS_common <- mvGPS(D=D, C=C, common=TRUE)

[Package mvGPS version 1.2.2 Index]