pred_update {predRupdate} | R Documentation |
Perform Model Updating on an Existing Prediction Model
Description
This function takes an existing (previously developed) prediction model and applies various model updating methods to tailor/adapt it to a new dataset. Various levels of updating are possible, ranging from model re-calibration to model refit.
Usage
pred_update(
x,
update_type = c("intercept_update", "recalibration", "refit"),
new_data,
binary_outcome = NULL,
survival_time = NULL,
event_indicator = NULL
)
Arguments
x |
an object of class " |
update_type |
character variable specifying the level of updating that is required. |
new_data |
data.frame upon which the prediction models should be updated. |
binary_outcome |
Character variable giving the name of the column in
|
survival_time |
Character variable giving the name of the column in
|
event_indicator |
Character variable giving the name of the column in
|
Details
This function takes a single existing (previously estimated) prediction model and applies various discrete model updating methods (see Su et al. 2018) to tailor the model to a new dataset.
The type of updating method is selected with the update_type
parameter, with options: "intercept_update", "recalibration" and "refit".
"intercept_update" corrects the overall calibration-in-the-large of the
model, through altering the model intercept (or baseline hazard) to suit
the new dataset. This is achieved by fitting a logistic model (if the
existing model is of type logistic) or time-to-event model (if the existing
model is of type survival) to the new dataset, with the linear predictor as
the only covariate, with the coefficient fixed at unity (i.e. as an
offset). "recalibration" corrects the calibration-in-the-large and any
under/over-fitting, by fitting a logistic model (if the existing model is
of type logistic) or time-to-event model (if the existing model is of type
survival) to the new dataset, with the linear predictor as the only
covariate. Finally, "refit" takes the original model structure and
re-estimates all coefficients; this has the same effect as re-developing the
original model in the new data.
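For a logistic model, the three update types described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the simulated data, the variable names (x1, x2, y, lp) and the coefficients of the "existing" model are all hypothetical.

```r
# Simulated "new" data; all names here are illustrative, not from predRupdate.
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.4)
y  <- rbinom(n, 1, plogis(-0.5 + 0.6 * x1 + 0.9 * x2))
new_data <- data.frame(x1, x2, y)

# Linear predictor of a hypothetical existing model (coefficients assumed).
new_data$lp <- -1.2 + 1.0 * new_data$x1 + 0.5 * new_data$x2

# "intercept_update": the slope of the linear predictor is fixed at 1 via an
# offset, so only a correction to the intercept is estimated.
fit_intercept <- glm(y ~ 1 + offset(lp), family = binomial, data = new_data)

# "recalibration": estimate both an intercept and a calibration slope.
fit_recal <- glm(y ~ lp, family = binomial, data = new_data)

# "refit": re-estimate every coefficient of the original model structure.
fit_refit <- glm(y ~ x1 + x2, family = binomial, data = new_data)

coef(fit_intercept)  # one parameter: the intercept correction
coef(fit_recal)      # intercept and slope on the linear predictor
coef(fit_refit)      # intercept plus one coefficient per predictor
```

The number of estimated parameters grows from one (intercept only) to two (intercept and slope) to the full set of model coefficients, which mirrors the increasing level of updating.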
new_data should be a data.frame, where each row is an observation (e.g.
patient) and each column is a predictor variable. The predictor variables
need to include (as a minimum) all of the predictor variables that are
included in the existing prediction model (i.e., each of the variable names
supplied to pred_input_info, through the model_info parameter, must match
the name of a variable in new_data).
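The name-matching requirement can be checked before calling the function. The following is a minimal sketch in base R; the predictor names and the small dataset are hypothetical and only serve to show the check.

```r
# Hypothetical predictor names taken from an existing model specification.
model_vars <- c("Age", "SexM", "Diabetes")

# Hypothetical new dataset that lacks the "Diabetes" column.
new_data <- data.frame(Age = c(61, 48), SexM = c(1, 0), Y = c(0, 1))

# Names required by the model but absent from new_data.
missing_vars <- setdiff(model_vars, names(new_data))

if (length(missing_vars) > 0) {
  message("Missing from new_data: ", paste(missing_vars, collapse = ", "))
}
```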
Any factor variables within new_data must be converted to dummy (0/1)
variables before calling this function. dummy_vars can help with this. See
pred_predict for examples.
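As an illustration of what the conversion to dummy variables produces, the sketch below uses base R's model.matrix rather than dummy_vars, so the resulting column names are those chosen by model.matrix and may differ from what the package helper generates. The dataset and factor levels are hypothetical.

```r
# Hypothetical dataset with one factor predictor.
new_data <- data.frame(
  Age = c(61, 48, 55),
  Smoking = factor(c("never", "current", "former"),
                   levels = c("never", "former", "current"))
)

# model.matrix() expands the factor into 0/1 indicator columns, using the
# first level ("never") as the reference category; drop the intercept column.
dummies <- model.matrix(~ Smoking, data = new_data)[, -1, drop = FALSE]

# Replace the factor column with its dummy columns.
new_data_dummy <- cbind(new_data["Age"], dummies)
new_data_dummy
```

Whichever tool is used, the names of the resulting dummy columns need to match the variable names in the existing model's specification.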
binary_outcome, survival_time and event_indicator are used to specify the
outcome variable(s) within new_data (use binary_outcome if x$model_type =
"logistic", or use survival_time and event_indicator if x$model_type =
"survival").
Value
An object of class "predUpdate". This is the same as that detailed in
pred_input_info, with the added element containing the estimates of the
model updating and the update type.
References
Su TL, Jaki T, Hickey GL, Buchan I, Sperrin M. A review of statistical updating methods for clinical prediction models. Stat Methods Med Res. 2018 Jan;27(1):185-197. doi: 10.1177/0962280215626466.
See Also
Examples
#Example 1 - update time-to-event model by updating the baseline hazard in new dataset
model1 <- pred_input_info(model_type = "survival",
                          model_info = SYNPM$Existing_TTE_models[1,],
                          cum_hazard = SYNPM$TTE_mod1_baseline)
recalibrated_model1 <- pred_update(x = model1,
                                   update_type = "intercept_update",
                                   new_data = SYNPM$ValidationData,
                                   survival_time = "ETime",
                                   event_indicator = "Status")
summary(recalibrated_model1)