R: adjustment on propensity score

ps_adjust {adapt4pv}

R Documentation

adjustment on propensity score

Description

Implements adjustment on the propensity score (PS) for all drug exposures of the input drug matrix x that have more than a given number of co-occurrences with the outcome. For each drug exposure retained after filtering, the binary outcome is regressed on the drug exposure and its estimated PS. This approach yields a p-value for each drug, and variable selection is performed on the p-values corrected for multiple comparisons.
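As a rough illustration of the regression step for a single drug, the sketch below uses base R only. It assumes, purely for illustration, that the PS is estimated with an ordinary logistic regression on the other drugs (this is not one of the estimators offered by the package) and that p-values come from a Wald test. The names `sim_x`, `sim_y`, `target` and `ps_hat` are hypothetical.

```r
set.seed(1)
# Simulated data (hypothetical): 200 observations, 5 binary drug exposures.
sim_x <- matrix(rbinom(200 * 5, 1, 0.3), nrow = 200, ncol = 5)
colnames(sim_x) <- paste0("drug", 1:5)
sim_y <- rbinom(200, 1, 0.3)

target <- "drug1"
others <- as.data.frame(sim_x[, colnames(sim_x) != target])

# Step 1 (assumption): PS = P(exposure to target | other drugs),
# estimated here with a plain logistic regression.
ps_fit <- glm(sim_x[, target] ~ ., data = others, family = binomial)
ps_hat <- fitted(ps_fit)

# Step 2: regress the binary outcome on the exposure and its estimated PS.
adj_fit <- glm(sim_y ~ sim_x[, target] + ps_hat, family = binomial)
coefs <- summary(adj_fit)$coefficients

# Row 2 is the exposure term; "z value" is its Wald statistic.
beta_hat <- coefs[2, "Estimate"]
z_stat <- coefs[2, "z value"]

# One-sided p-value (alternative: positive association) and two-sided p-value.
p_one_sided <- pnorm(z_stat, lower.tail = FALSE)
p_two_sided <- 2 * pnorm(abs(z_stat), lower.tail = FALSE)
```

The one-sided p-value corresponds to the situation described for `betaPos = TRUE`, and the two-sided one to `betaPos = FALSE`; how the package computes them internally may differ.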

Usage

ps_adjust(
  x,
  y,
  n_min = 3,
  betaPos = TRUE,
  est_type = "bic",
  threshold = 0.05,
  ncore = 1
)

Arguments

`x`	Input matrix, of dimension nobs x nvars. Each row is an observation vector. Can be in sparse matrix format (inheriting from class `"sparseMatrix"` as in package `Matrix`).
`y`	Binary response variable, numeric.
`n_min`	Numeric, minimal number of co-occurrences between a drug covariate and the outcome y required to estimate its score. See details below. Default is 3.
`betaPos`	Should the covariates selected by the procedure be positively associated with the outcome? Default is `TRUE`.
`est_type`	Character, indicates which approach is used to estimate the PS. One of "bic", "hdps" or "xgb". Default is "bic".
`threshold`	Threshold for the p-values. Default is 0.05.
`ncore`	The number of cores used for parallel computing. Default is 1 (no parallelization).

Details

The PS can be estimated in different ways: using the lasso-bic approach, the hdps algorithm, or gradient tree boosting. The scores are estimated using the default parameter values of the est_ps_bic, est_ps_hdps and est_ps_xgb functions (see their documentation for details). We apply the same filter and the same multiple testing correction as in the paper UPCOMING REFERENCE: first, PS are estimated only for drug covariates that have more than n_min co-occurrences with the outcome y. Adjustment on the PS is performed for these covariates, and one-sided or two-sided p-values (depending on the betaPos parameter) are obtained. The p-values of the covariates not retained after filtering are set to 1. All these p-values are then adjusted for multiple comparisons with the Benjamini-Yekutieli correction. This function can take a long time to run: since the approach (i) estimates a score for several drug covariates and (ii) performs an adjustment on each of these scores, parallelization is highly recommended.
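The filtering and correction steps described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's internal code: the raw p-values are random placeholders standing in for the results of the PS-adjusted regressions, and the names `sim_x`, `sim_y`, `cooc`, `keep` and `raw_p` are hypothetical. Only `p.adjust(method = "BY")` from the `stats` package is used for the Benjamini-Yekutieli correction.

```r
set.seed(2)
# Simulated data (hypothetical): 100 observations, 6 binary drug exposures.
sim_x <- matrix(rbinom(100 * 6, 1, 0.2), nrow = 100, ncol = 6)
colnames(sim_x) <- paste0("drug", 1:6)
sim_y <- rbinom(100, 1, 0.3)
n_min <- 3

# Co-occurrences: observations with both the exposure and the outcome.
cooc <- colSums(sim_x[sim_y == 1, , drop = FALSE])

# Filter: keep covariates with more than n_min co-occurrences.
keep <- cooc > n_min

# Placeholder p-values (assumption): in practice these would come from
# the PS-adjusted regressions. Filtered-out covariates get a p-value of 1.
raw_p <- rep(1, ncol(sim_x))
names(raw_p) <- colnames(sim_x)
raw_p[keep] <- runif(sum(keep))

# Benjamini-Yekutieli correction over all nvars p-values.
corrected_p <- p.adjust(raw_p, method = "BY")
```

Because the correction is applied to all nvars p-values, including those set to 1, the filtered-out covariates still count toward the number of comparisons.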

Value

An object with S3 class "ps", "adjust", "*", where "*" is "bic", "hdps" or "xgb", according to how the scores were estimated.

`estimates`	Regression coefficients associated with the drug covariates. Numeric, length equal to the number of selected variables with this approach. Some elements could be NA if (i) the corresponding covariate was filtered out or (ii) the adjustment model did not converge. Estimating the score in a different way could help, but this is not guaranteed.
`corrected_pvals`	One-sided p-values if `betaPos = TRUE`, two-sided p-values if `betaPos = FALSE`, adjusted for multiple testing. Numeric, length equal to nvars.
`selected_variables`	Character vector, names of variable(s) selected with the ps-adjust approach. If `betaPos = TRUE`, this set is the covariates with a corrected one-sided p-value lower than `threshold`. Otherwise, this set is the covariates with a corrected two-sided p-value lower than `threshold`. Covariates are ordered according to their corrected p-value.
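The selection rule described for `selected_variables` can be illustrated with a minimal base R sketch. The p-values below are made-up numbers used only to show the thresholding and ordering; they are not output from the package.

```r
# Hypothetical corrected p-values for five drug covariates.
corrected_pvals <- c(drug1 = 0.20, drug2 = 0.01, drug3 = 1,
                     drug4 = 0.03, drug5 = 0.049)
threshold <- 0.05

# Keep covariates strictly below the threshold, ordered by increasing p-value.
below <- corrected_pvals[corrected_pvals < threshold]
selected <- names(sort(below))
selected
# "drug2" "drug4" "drug5"
```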

Author(s)

Emeline Courtois
Maintainer: Emeline Courtois emeline.courtois@inserm.fr

References

Benjamini, Y., & Yekutieli, D. (2001). "The Control of the False Discovery Rate in Multiple Testing under Dependency". The Annals of Statistics. 29(4), 1165–1188, doi:10.1214/aos/1013699998.

Examples


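set.seed(15)
# Simulate 100 observations of 20 binary drug exposures
drugs <- matrix(rbinom(100 * 20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs", 1:ncol(drugs))
# Simulate a binary adverse event outcome
ae <- rbinom(100, 1, 0.3)
# PS adjustment with the default "bic" estimator and n_min = 10
adjps <- ps_adjust(x = drugs, y = ae, n_min = 10)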

[Package adapt4pv version 0.2-3 Index]