propensity_score {GenericML}    R Documentation

Propensity score estimation


Estimates the propensity scores Pr[D = 1 | Z] for binary treatment assignment D and covariates Z. Estimation is done either by taking the empirical mean of D (which should be roughly 0.5, since we assume a randomized experiment) or by direct machine learning estimation.
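The two estimation routes described above can be illustrated in base R, without GenericML or mlr3. This is only a sketch under stated assumptions: it assumes that the constant estimate is assigned to every observation, and it uses logistic regression via glm() as a stand-in for a machine learning estimator. It is not the package's implementation, and the names ps_constant and ps_logit are hypothetical.

```r
set.seed(1)                          # for reproducibility of this sketch
n <- 100                             # number of observations
p <- 5                               # number of covariates
D <- rbinom(n, 1, 0.5)               # randomized binary treatment assignment
Z <- matrix(runif(n * p), n, p)      # design matrix of covariates

## constant-style estimate: the sample share of treated units,
## repeated for every observation (assumption about the output shape)
ps_constant <- rep(mean(D), length(D))

## stand-in for a learned estimate: logistic regression of D on Z
fit      <- glm(D ~ Z, family = binomial())
ps_logit <- as.numeric(fitted(fit))

## under randomization, both sets of estimates should center near 0.5
summary(ps_constant)
summary(ps_logit)
```

In a randomized experiment the covariates carry no information about assignment, so the fitted probabilities should scatter around the empirical mean of D.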


propensity_score(Z, D, estimator = "constant")



Z: A numeric design matrix that holds the covariates in its columns.


D: A binary vector of treatment assignment. Value one denotes assignment to the treatment group and value zero denotes assignment to the control group.


estimator: Character string specifying the estimator. Must be one of 'constant' (estimates the propensity scores by mean(D)), 'lasso', 'random_forest', 'tree', or a learner in mlr3 syntax. When using mlr3 syntax, do not specify whether the learner is a regression learner or a classification learner. Example: 'mlr3::lrn("ranger", num.trees = 500)' for a random forest learner. Note that the specification is passed as a string and that the classif. and regr. keywords are absent. See the mlr3 documentation for a list of mlr3 learners.
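The requirements on the first two arguments can be checked before calling the function. The helper below, check_inputs(), is hypothetical and written in base R. It is a minimal sketch of the stated requirements (numeric matrix, binary vector, matching dimensions), not the validation that the package itself performs.

```r
## hypothetical helper: checks the documented requirements on Z and D
check_inputs <- function(Z, D) {
  stopifnot(
    is.matrix(Z), is.numeric(Z),   # numeric design matrix
    is.numeric(D),                 # numeric treatment vector
    length(D) == nrow(Z),          # one assignment per observation
    all(D %in% c(0, 1))            # binary coding: 1 = treated, 0 = control
  )
  invisible(TRUE)
}

## passes silently for well-formed inputs
check_inputs(matrix(runif(20), 10, 2), rbinom(10, 1, 0.5))
```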


The specifications "lasso", "random_forest", and "tree" in estimator correspond to the following mlr3 specifications (we omit the keywords classif. and regr.). "lasso" is a cross-validated Lasso estimator, which corresponds to 'mlr3::lrn("cv_glmnet", s = "lambda.min", alpha = 1)'. "random_forest" is a random forest with 500 trees, which corresponds to 'mlr3::lrn("ranger", num.trees = 500)'. "tree" is a tree learner, which corresponds to 'mlr3::lrn("rpart")'.
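The correspondence above can be written as a small lookup table. The following base R sketch is hypothetical: the names shorthand, expand_estimator(), and add_task_keyword() are not part of GenericML, and the way the classif. keyword is inserted is an assumption made for illustration. It only manipulates strings and does not call mlr3.

```r
## shorthand names and the mlr3 specifications listed above
shorthand <- c(
  lasso         = 'mlr3::lrn("cv_glmnet", s = "lambda.min", alpha = 1)',
  random_forest = 'mlr3::lrn("ranger", num.trees = 500)',
  tree          = 'mlr3::lrn("rpart")'
)

## hypothetical helper: expand a shorthand, pass other strings through
expand_estimator <- function(estimator) {
  if (estimator %in% names(shorthand)) shorthand[[estimator]] else estimator
}

## hypothetical helper: insert the omitted task keyword into the string
add_task_keyword <- function(spec, keyword = "classif.") {
  sub('lrn("', paste0('lrn("', keyword), spec, fixed = TRUE)
}

add_task_keyword(expand_estimator("random_forest"))
```

Because the user-facing string omits the keyword, one string can describe the learner regardless of whether it is later used for classification or regression.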


An object of class "propensity_score", consisting of the following components:


A numeric vector of propensity score estimates.


"mlr3" objects used for estimation. Only non-empty if mlr3 was used.


Rosenbaum P.R., Rubin D.B. (1983). “The Central Role of the Propensity Score in Observational Studies for Causal Effects.” Biometrika, 70(1), 41–55. doi: 10.1093/biomet/70.1.41.

Lang M., Binder M., Richter J., Schratz P., Pfisterer F., Coors S., Au Q., Casalicchio G., Kotthoff L., Bischl B. (2019). “mlr3: A Modern Object-Oriented Machine Learning Framework in R.” Journal of Open Source Software, 4(44), 1903. doi: 10.21105/joss.01903.


## generate data
n  <- 100                        # number of observations
p  <- 5                          # number of covariates
D  <- rbinom(n, 1, 0.5)          # random treatment assignment
Z  <- matrix(runif(n*p), n, p)   # design matrix

## estimate propensity scores via mean(D)...
propensity_score(Z, D, estimator = "constant")

## ... and via SVM with cache size 40
propensity_score(Z, D,
                 estimator = 'mlr3::lrn("svm", cachesize = 40)')

[Package GenericML version 0.2.2 Index]