port {RISCA} | R Documentation |
POsitivity-Regression Tree (PoRT) Algorithm to Identify Positivity Violations.
Description
This function identifies potential positivity violations using the PoRT algorithm.
Usage
port(group, cov.quanti, cov.quali, data, alpha, beta, gamma, pruning,
minbucket, minsplit, maxdepth)
Arguments
group |
A character string with the name of the exposure in |
cov.quanti |
A character string with the names of the quantitative predictors in |
cov.quali |
A character string with the names of the qualitative predictors in |
data |
A data frame in which to look for the variables related to the treatment/exposure and the predictors. |
alpha |
The minimal proportion of the whole sample that a subgroup must represent to be considered problematic. The default value is 0.05. |
beta |
The proportion of exposed or unexposed individuals below which a subgroup is considered a positivity violation. The default value is 0.05. |
gamma |
The maximum number of predictors used to define the subgroup. The default value is 2. See 'Details'. |
pruning |
If |
minbucket |
An |
minsplit |
An |
maxdepth |
An |
Details
In a first step, the PoRT algorithm estimates one tree for each predictor and memorises the leaves corresponding to problematic subgroups according to the hyperparameters alpha and beta (i.e., the subgroup must include at least alpha*100 percent of the whole sample, and the exposure prevalence in the subgroup must be greater than 1-beta or less than beta). If gamma=1, the algorithm stops. Otherwise, any predictor involved in a problematic subgroup identified in this first step is excluded from the second step, which estimates one tree for each possible pair of remaining predictors and memorises the leaves corresponding to problematic subgroups. If gamma=2, the algorithm stops; otherwise, the third step builds one tree for each possible trio of remaining covariates not involved in the previously identified subgroups, and so on.
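The selection rule and the step-wise choice of predictor sets described above can be sketched in base R. This is only an illustration, not the code used by port: the helper is_problematic, the object names, and the assumption that step 1 flagged a subgroup defined by "age" are all hypothetical.

```r
# Hypothetical helper (not part of RISCA): does a leaf meet the alpha/beta rule?
is_problematic <- function(n_leaf, n_total, prevalence,
                           alpha = 0.05, beta = 0.05) {
  large_enough <- n_leaf / n_total >= alpha                 # at least alpha*100 percent
  extreme      <- prevalence < beta | prevalence > 1 - beta # almost never/always exposed
  large_enough & extreme
}

# Large subgroup with almost no exposed individuals: flagged
is_problematic(n_leaf = 120, n_total = 1000, prevalence = 0.02)
# Extreme prevalence, but the subgroup is smaller than alpha: not flagged
is_problematic(n_leaf = 30, n_total = 1000, prevalence = 0.00)
# Large subgroup with a balanced exposure: not flagged
is_problematic(n_leaf = 400, n_total = 1000, prevalence = 0.40)

# Step 1: one tree per predictor
predictors <- c("age", "hla", "retransplant")
step1_sets <- combn(predictors, 1, simplify = FALSE)

# Assumption for illustration: step 1 flagged a subgroup defined by "age"
flagged_step1 <- "age"

# Step 2 (gamma = 2): one tree per pair of the remaining predictors
remaining  <- setdiff(predictors, flagged_step1)
step2_sets <- combn(remaining, 2, simplify = FALSE)
print(step2_sets)
```

In this sketch, only the pair of predictors not involved in the hypothetical step-1 subgroup is carried over to the second step.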
Value
The port function returns a character string summarising all the subgroups identified as violating the positivity assumption. For each of these subgroups, it provides the exposure prevalence, the subgroup size and the relative subgroup size (with respect to the sample size).
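The three quantities reported for each subgroup can be computed by hand as a rough check. The sketch below uses invented toy data and a hypothetical subgroup definition (age < 30); it does not reproduce the exact layout of the string returned by port.

```r
# Toy data, invented for illustration only: 100 individuals,
# 20 of them aged 25 (2 exposed) and 80 aged 60 (40 exposed)
toy <- data.frame(
  exposed = c(rep(0, 18), rep(1, 2), rep(c(0, 1), 40)),
  age     = c(rep(25, 20), rep(60, 80))
)

# Hypothetical subgroup definition
in_subgroup <- toy$age < 30

summary_row <- data.frame(
  subgroup      = "age < 30",
  prevalence    = mean(toy$exposed[in_subgroup]),   # exposure prevalence in the subgroup
  size          = sum(in_subgroup),                 # subgroup size
  relative_size = sum(in_subgroup) / nrow(toy)      # share of the whole sample
)
print(summary_row)
```

With the default beta of 0.05, this toy subgroup would not count as a violation, because its exposure prevalence (2 out of 20) lies between beta and 1-beta.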
Author(s)
Arthur Chatton <Arthur.Chatton@univ-nantes.fr>
References
Danelian et al. Identification of positivity violations using regression trees: The PoRT algorithm. Manuscript submitted. 2022.
Examples
library(RISCA)
data("dataDIVAT2")
# PoRT with default hyperparameters
port(group="ecd", cov.quanti="age", cov.quali=c("hla", "retransplant"),
data=dataDIVAT2)
# Illustration of the 'pruning' argument
port(group="ecd", cov.quanti="age", cov.quali=c("hla", "retransplant"),
data=dataDIVAT2, beta=0.01)
port(group="ecd", cov.quanti="age", cov.quali=c("hla", "retransplant"),
data=dataDIVAT2, beta=0.01, pruning=TRUE)