validate.subgroup {personalized} | R Documentation |
Validating fitted subgroup identification models
Description
Validates subgroup treatment effects for a fitted subgroup identification model of the class described by Chen et al. (2017).
Usage
validate.subgroup(
model,
B = 50L,
method = c("training_test_replication", "boot_bias_correction"),
train.fraction = 0.75,
benefit.score.quantiles = c(0.1666667, 0.3333333, 0.5, 0.6666667, 0.8333333),
parallel = FALSE
)
Arguments
model |
fitted model object returned by |
B |
Integer. Number of bootstrap replications or refitting replications. |
method |
validation method. |
train.fraction |
fraction (between 0 and 1) of samples to be used for training in
training/test replication. Only used for |
benefit.score.quantiles |
A vector of quantiles (between 0 and 1) of the benefit score values for which to return bootstrapped information about the subgroups. For example, if one of the quantile values is 0.5, the median of the benefit scores is used as the cutoff to determine subgroups, and summary statistics are returned for those subgroups. |
parallel |
Should the loop over replications be parallelized? If |
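The roles of train.fraction, B, and benefit.score.quantiles can be pictured with a small base R sketch. This is an illustration only and not the package's internal code: the benefit scores are simulated stand-ins, and the names splits, cutoffs, and prop.above are hypothetical.

```r
set.seed(1)
n <- 200
train.fraction <- 0.75
B <- 5

# hypothetical benefit scores standing in for a fitted model's output
benefit.score <- rnorm(n)

# one random training/test split per replication
splits <- lapply(seq_len(B), function(b) {
  train.idx <- sample.int(n, size = floor(train.fraction * n))
  list(train = train.idx, test = setdiff(seq_len(n), train.idx))
})

# each quantile of the benefit score gives one cutoff, and each cutoff
# defines a pair of subgroups (score above vs. at or below the cutoff)
probs <- c(1, 2, 3, 4, 5) / 6
cutoffs <- quantile(benefit.score, probs = probs)
prop.above <- sapply(cutoffs, function(ct) mean(benefit.score > ct))
```

Higher quantiles give stricter cutoffs, so the share of observations placed in the "above cutoff" subgroup shrinks as the quantile increases.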
Details
Estimates of various quantities conditional on subgroups and treatment statuses are provided and displayed via the print.subgroup_validated function:
"Conditional expected outcomes": The first results shown when printing a subgroup_validated object are estimates of the expected outcomes conditional on the estimated subgroups (i.e. which subgroup is 'recommended' by the model) and conditional on treatment/intervention status. If there are two total treatment options, this results in a 2x2 table of expected conditional outcomes.
"Treatment effects conditional on subgroups": The second results shown when printing a subgroup_validated object are estimates of the treatment effects conditional on the estimated subgroups. If the treatment takes levels j \in \{1, \dots, K\}, a total of K conditional treatment effects will be shown. For example, if the outcome is continuous, the jth conditional treatment effect is defined as E(Y|Trt = j, Subgroup = j) - E(Y|Trt =/= j, Subgroup = j), where Subgroup = j if treatment j is recommended, i.e. treatment j results in the largest/best expected potential outcomes given the fitted model.
"Overall treatment effect conditional on subgroups": The third quantity displayed shows the overall improvement in outcomes resulting from all treatment recommendations. This is essentially an average over all of the conditional treatment effects, weighted by the proportion of the population recommended each respective treatment level.
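The three printed quantities can be illustrated with a minimal base R sketch for a two-arm, continuous-outcome setting. Everything here is an assumption made for illustration: the data are simulated, the benefit scores are hypothetical rather than produced by fit.subgroup, and the cutoff of 0 is arbitrary. Within each recommended subgroup, the sketch takes the mean outcome of those who received the recommended treatment minus the mean outcome of those who did not.

```r
set.seed(2)
n <- 1000
trt <- rbinom(n, 1, 0.5)              # 1 = treatment, 0 = control
benefit.score <- rnorm(n)             # hypothetical benefit scores
# simulated outcome: treatment helps when the score is positive
y <- 1 + benefit.score * (2 * trt - 1) + rnorm(n)

# recommend treatment when the score exceeds the (arbitrary) cutoff 0
recommended <- as.integer(benefit.score > 0)

# (1) 2x2 table of mean outcomes by received and recommended treatment
cond.means <- tapply(y, list(received = trt, recommended = recommended), mean)

# (2) effect within each subgroup: received the recommendation minus did not
eff.rec.trt  <- cond.means["1", "1"] - cond.means["0", "1"]
eff.rec.ctrl <- cond.means["0", "0"] - cond.means["1", "0"]

# (3) overall effect: average of (2) weighted by subgroup proportions
p.trt <- mean(recommended)
overall <- p.trt * eff.rec.trt + (1 - p.trt) * eff.rec.ctrl
```

Because the weights are the subgroup proportions, the overall effect always lies between the two subgroup-specific effects.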
Value
An object of class "subgroup_validated"
avg.results |
Estimates of average conditional treatment effects when
subgroups are determined based on the provided cutoff value for the benefit score. For example,
if |
se.results |
Standard errors of the estimates from |
boot.results |
Contains the individual results for each replication. |
avg.quantile.results |
Estimates of average conditional treatment effects when
subgroups are determined based on different quantile cutoff values for the benefit score. For example,
if |
se.quantile.results |
Standard errors corresponding to |
boot.results.quantiles |
Contains the individual results for each replication. |
family |
Family of the outcome. For example, |
method |
Method used for subgroup identification model. Weighting or A-learning |
n.trts |
The number of treatment levels |
comparison.trts |
All treatment levels other than the reference level |
reference.trt |
The reference level for the treatment. This should usually be the control group/level |
larger.outcome.better |
If larger outcomes are preferred for this model |
cutpoint |
Benefit score cutoff value used for determining subgroups |
val.method |
Method used for validation |
iterations |
Number of replications used in the validation process |
nobs |
Number of observations in |
nvars |
Number of variables in |
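One way to picture how the per-replication results relate to the averaged estimates and their standard errors is the base R sketch below. It assumes, purely for illustration, that the summaries are element-wise means and standard deviations across replications; the page does not state how the package aggregates, and the simulated array and its dimension names are hypothetical.

```r
set.seed(3)
B <- 50

# hypothetical per-replication results: one 2x2 table per replication
boot.results <- array(
  rnorm(2 * 2 * B, mean = 1, sd = 0.5),
  dim = c(2, 2, B),
  dimnames = list(c("Received Trt", "Received Ctrl"),
                  c("Recommended Trt", "Recommended Ctrl"),
                  NULL)
)

# assumed aggregation: element-wise mean and standard deviation over replications
avg.results <- apply(boot.results, c(1, 2), mean)
se.results  <- apply(boot.results, c(1, 2), sd)
```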
References
Chen, S., Tian, L., Cai, T. and Yu, M. (2017), A general statistical framework for subgroup identification and comparative treatment scoring. Biometrics. doi:10.1111/biom.12676
Harrell, F. E., Lee, K. L., and Mark, D. B. (1996). Tutorial in biostatistics multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in medicine, 15, 361-387. doi:10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4
Huling, J.D. and Yu, M. (2021), Subgroup Identification Using the personalized Package. Journal of Statistical Software 98(5), 1-60. doi:10.18637/jss.v098.i05
See Also
fit.subgroup for the function that fits subgroup identification models, plot.subgroup_validated for plotting of validation results, and print.subgroup_validated for printing options for objects returned by validate.subgroup().
Examples
library(personalized)
set.seed(123)
n.obs <- 500
n.vars <- 20
x <- matrix(rnorm(n.obs * n.vars, sd = 3), n.obs, n.vars)
# simulate non-randomized treatment
xbetat <- 0.5 + 0.5 * x[,11] - 0.5 * x[,13]
trt.prob <- exp(xbetat) / (1 + exp(xbetat))
trt01 <- rbinom(n.obs, 1, prob = trt.prob)
trt <- 2 * trt01 - 1
# simulate response
delta <- 2 * (0.5 + x[,2] - x[,3] - x[,11] + x[,1] * x[,12])
xbeta <- x[,1] + x[,11] - 2 * x[,12]^2 + x[,13]
xbeta <- xbeta + delta * trt
# continuous outcomes
y <- drop(xbeta) + rnorm(n.obs, sd = 2)
# create function for fitting propensity score model
prop.func <- function(x, trt)
{
    # fit propensity score model
    propens.model <- cv.glmnet(y = trt,
                               x = x, family = "binomial")
    pi.x <- predict(propens.model, s = "lambda.min",
                    newx = x, type = "response")[,1]
    pi.x
}
subgrp.model <- fit.subgroup(x = x, y = y,
                             trt = trt01,
                             propensity.func = prop.func,
                             loss = "sq_loss_lasso",
                             # option for cv.glmnet,
                             # better to use 'nfolds=10'
                             nfolds = 3)
x.test <- matrix(rnorm(10 * n.obs * n.vars, sd = 3), 10 * n.obs, n.vars)
# simulate non-randomized treatment
xbetat.test <- 0.5 + 0.5 * x.test[,11] - 0.5 * x.test[,13]
trt.prob.test <- exp(xbetat.test) / (1 + exp(xbetat.test))
trt01.test <- rbinom(10 * n.obs, 1, prob = trt.prob.test)
trt.test <- 2 * trt01.test - 1
# simulate response
delta.test <- 2 * (0.5 + x.test[,2] - x.test[,3] - x.test[,11] + x.test[,1] * x.test[,12])
xbeta.test <- x.test[,1] + x.test[,11] - 2 * x.test[,12]^2 + x.test[,13]
xbeta.test <- xbeta.test + delta.test * trt.test
y.test <- drop(xbeta.test) + rnorm(10 * n.obs, sd = 2)
# B = 2 keeps the example fast; use a larger B in practice
valmod <- validate.subgroup(subgrp.model, B = 2,
                            method = "training_test_replication",
                            train.fraction = 0.75)
valmod
print(valmod, which.quant = c(4, 5))