BuildRule {DevTreatRules} | R Documentation |
Build a Treatment Rule
Description
Perform principled development of a treatment rule (using the IPW approach to account for potential confounding) on a development dataset (i.e. training set) that is independent of datasets used for model selection (i.e. validation set) and rule evaluation (i.e. test set).
Usage
BuildRule(
development.data,
study.design,
prediction.approach,
name.outcome,
type.outcome,
name.treatment,
names.influencing.treatment = NULL,
names.influencing.rule,
desirable.outcome,
rule.method = NULL,
propensity.method,
additional.weights = rep(1, nrow(development.data)),
truncate.propensity.score = TRUE,
truncate.propensity.score.threshold = 0.05,
type.observation.weights = NULL,
propensity.k.cv.folds = 10,
rule.k.cv.folds = 10,
lambda.choice = c("min", "1se"),
OWL.lambda.seq = NULL,
OWL.kernel = "linear",
OWL.kparam.seq = NULL,
OWL.cvFolds = 10,
OWL.verbose = TRUE,
OWL.framework.shift.by.min = TRUE,
direct.interactions.center.continuous.Y = TRUE,
direct.interactions.exclude.A.from.penalty = TRUE
)
Arguments
development.data |
A data frame representing the *development* dataset (i.e. training set) used for building a treatment rule. |
study.design |
Either ‘observational’, ‘RCT’, or ‘naive’. For the |
prediction.approach |
One of ‘split.regression’, ‘direct.interactions’, ‘OWL’, or ‘OWL.framework’. |
name.outcome |
A character indicating the name of the outcome variable in |
type.outcome |
Either ‘binary’ or ‘continuous’, the form of |
name.treatment |
A character indicating the name of the treatment variable in |
names.influencing.treatment |
A character vector (or single element) indicating the names of the variables in |
names.influencing.rule |
A character vector (or single element) indicating the names of the variables in |
desirable.outcome |
A logical equal to |
rule.method |
One of ‘glm.regression’, ‘lasso’, or ‘ridge’. For |
propensity.method |
One of ‘logistic.regression’, ‘lasso’, or ‘ridge’. This is the underlying regression model used to estimate propensity scores for |
additional.weights |
A numeric vector of observation weights that will be multiplied by IPW weights in the rule development stage, with length equal to the number of rows in |
truncate.propensity.score |
A logical variable dictating whether estimated propensity scores less than |
truncate.propensity.score.threshold |
A numeric value between 0 and 0.25. |
type.observation.weights |
Default is NULL, but other choices are ‘IPW.L’, ‘IPW.L.and.X’, and ‘IPW.ratio’, where L indicates |
propensity.k.cv.folds |
An integer specifying how many folds to use for K-fold cross-validation that chooses the tuning parameters when |
rule.k.cv.folds |
An integer specifying how many folds to use for K-fold cross-validation that chooses the tuning parameter when |
lambda.choice |
Either ‘min’ or ‘1se’, corresponding to the |
OWL.lambda.seq |
Used when |
OWL.kernel |
Used when |
OWL.kparam.seq |
Used when |
OWL.cvFolds |
Used when |
OWL.verbose |
Used when |
OWL.framework.shift.by.min |
Logical, set to |
direct.interactions.center.continuous.Y |
Logical, set to |
direct.interactions.exclude.A.from.penalty |
Logical, set to |
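The propensity-score arguments above (propensity.method, truncate.propensity.score, truncate.propensity.score.threshold, additional.weights) can be illustrated with a minimal base-R sketch. It is not the DevTreatRules implementation. The simulated variables `L` and `A` are hypothetical, and the truncation rule shown (clipping estimated scores into [threshold, 1 - threshold]) is an assumption, since the argument description only states that scores below a threshold are affected.

```r
# Hypothetical illustration in base R; not the DevTreatRules implementation.
set.seed(1)
n <- 200
L <- rnorm(n)                                      # variable influencing treatment (simulated)
A <- rbinom(n, size = 1, prob = plogis(1.5 * L))   # binary treatment indicator (simulated)

# Estimate propensity scores P(A = 1 | L) with logistic regression
propensity.fit <- glm(A ~ L, family = binomial)
propensity.score <- fitted(propensity.fit)

# ASSUMPTION: truncation clips scores into [threshold, 1 - threshold]
threshold <- 0.05
truncated.score <- pmin(pmax(propensity.score, threshold), 1 - threshold)

# IPW weight: inverse of the estimated probability of the treatment actually received
ipw.weights <- ifelse(A == 1, 1 / truncated.score, 1 / (1 - truncated.score))

# Multiply by any additional observation weights (all equal to 1 here)
additional.weights <- rep(1, n)
observation.weights <- additional.weights * ipw.weights
range(observation.weights)
```

Under this assumed rule, no weight can exceed 1 / threshold, which is the usual motivation for truncating: a few observations with extreme estimated propensity scores cannot dominate the fit.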
Value
A list with some combination of the following components (depending on the specified prediction.approach):
- type.outcome: The type.outcome specified above (used by other functions that are based on BuildRule()).
- prediction.approach: The prediction.approach specified above (used by other functions that are based on BuildRule()).
- rule.method: The rule.method specified above (used by other functions that are based on BuildRule()).
- lambda.choice: The lambda.choice specified above (used by other functions that are based on BuildRule()).
- propensity.score.object: A list containing the relevant regression object from propensity score estimation. The list has two elements for type.observation.weights=‘IPW.ratio’ (the default for prediction.approach=‘split.regression’), has one element for type.observation.weights=‘IPW.L’ (the default for ‘OWL’, ‘OWL.framework’ and ‘direct.interactions’), has one element when type.observation.weights=‘IPW.L.and.X’, and is simply equal to NA if study.design=‘RCT’ (in which case propensity score would just be the inverse of sample proportion receiving treatment).
- owl.object: For prediction.approach=‘OWL’ only, the object returned by the owl() function in the DynTxRegime package.
- observation.weights: The observation weights used for estimating the treatment rule.
- rule.object: For prediction.approach=‘OWL.framework’ or prediction.approach=‘direct.interactions’, the regression object returned from treatment rule estimation (to which the coef() function could be applied, for example).
- rule.object.control: For prediction.approach=‘split.regression’, the regression object returned from treatment rule estimation (to which the coef() function could be applied, for example) that estimates the outcome variable for individuals who do not receive treatment.
- rule.object.treatment: For prediction.approach=‘split.regression’, the regression object returned from treatment rule estimation (to which the coef() function could be applied, for example) that estimates the outcome variable for individuals who do receive treatment.
References
Yingqi Zhao, Donglin Zeng, A. John Rush & Michael R. Kosorok (2012) Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association, 107(499): 1106–1118.
Shuai Chen, Lu Tian, Tianxi Cai & Menggang Yu (2017) A general statistical framework for subgroup identification and comparative treatment scoring. Biometrics, 73(4): 1199–1209.
Lu Tian, Ash A. Alizadeh, Andrew J. Gentles & Robert Tibshirani (2014) A simple method for estimating interactions between a treatment and a large number of covariates. Journal of the American Statistical Association, 109(508): 1517–1532.
Jeremy Roth and Noah Simon (2019). Using propensity scores to develop and evaluate treatment rules with observational data (Manuscript in progress)
Jeremy Roth and Noah Simon (2019). Elucidating outcome-weighted learning and its comparison to split-regression: direct vs. indirect methods in practice. (Manuscript in progress)
Examples
library(DevTreatRules)

# Split the example dataset into development, validation, and evaluation sets
set.seed(123)
example.split <- SplitData(data=obsStudyGeneExpressions,
                           n.sets=3, split.proportions=c(0.5, 0.25, 0.25))
development.data <- example.split[example.split$partition == "development",]

# Build a split-regression treatment rule on the development set only
one.rule <- BuildRule(development.data=development.data,
                      study.design="observational",
                      prediction.approach="split.regression",
                      name.outcome="no_relapse",
                      type.outcome="binary",
                      desirable.outcome=TRUE,
                      name.treatment="intervention",
                      names.influencing.treatment=c("prognosis", "clinic", "age"),
                      names.influencing.rule=c("age", paste0("gene_", 1:10)),
                      propensity.method="logistic.regression",
                      rule.method="glm.regression")

# Coefficients of the outcome models for untreated and treated individuals
coef(one.rule$rule.object.control)
coef(one.rule$rule.object.treatment)
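To give a rough picture of what a split-regression rule does with the two fitted objects, here is a minimal base-R sketch on simulated data. It is not the DevTreatRules implementation: the data frame `sim`, its columns, and the plain inverse-probability weights are hypothetical choices made for illustration (the package's default weights for ‘split.regression’ are ‘IPW.ratio’, which this sketch does not reproduce).

```r
# Hypothetical illustration in base R; not the DevTreatRules implementation.
set.seed(2)
n <- 500
age <- rnorm(n)                                  # simulated covariate
A <- rbinom(n, 1, plogis(0.5 * age))             # simulated treatment, confounded by age
Y <- rbinom(n, 1, plogis(-0.5 + A * age))        # simulated binary outcome
sim <- data.frame(age, A, Y)

# ASSUMPTION: plain IPW weights from a logistic propensity model
ps <- fitted(glm(A ~ age, family = binomial, data = sim))
sim$w <- ifelse(sim$A == 1, 1 / ps, 1 / (1 - ps))

# Fit separate weighted outcome models in the control and treatment groups
# (quasibinomial avoids warnings about non-integer weights)
fit.control <- glm(Y ~ age, family = quasibinomial,
                   data = sim[sim$A == 0, ], weights = w)
fit.treatment <- glm(Y ~ age, family = quasibinomial,
                     data = sim[sim$A == 1, ], weights = w)

# Predict each individual's outcome under no treatment and under treatment
pred.control <- predict(fit.control, newdata = sim, type = "response")
pred.treatment <- predict(fit.treatment, newdata = sim, type = "response")

# Recommend treatment when it is predicted to give the better outcome
desirable.outcome <- TRUE
recommend.treatment <- if (desirable.outcome) {
  pred.treatment > pred.control
} else {
  pred.treatment < pred.control
}
table(recommend.treatment)
```

The two models play the roles of rule.object.control and rule.object.treatment: the rule compares their predictions for the same individual, and desirable.outcome determines which direction counts as better.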