model_selection {DImodelsVis} | R Documentation |
Visualising model selection
Description
This function helps to visualise model selection by showing a visual comparison between the information criteria for different models. It is also possible to visualise a breakup of the information criteria into deviance (goodness-of-fit) and penalty terms for each model. This could aid in understanding why a parsimonious model could be preferable over a more complex model.
Usage
model_selection(
models,
metric = c("AIC", "BIC", "AICc", "BICc", "deviance"),
sort = FALSE,
breakup = FALSE,
plot = TRUE,
model_names = names(models)
)
Arguments
models |
List of statistical regression model objects. |
metric |
Metric used for comparisons between models. Takes values from c("AIC", "BIC", "AICc", "BICc", "logLik"). Can choose a single or multiple metrics for comparing the different models. |
sort |
A boolean value indicating whether to sort the model from highest to lowest value of chosen metric. |
breakup |
A boolean value indicating whether to breakup the metric value into deviance (defined as -2*loglikelihood) and penalty components. Will work only if a single metric out of "AIC", "AICc", "BIC", or "BICc" is chosen to plot. |
plot |
A boolean variable indicating whether to create the plot or return the prepared data instead. The default 'TRUE' creates the plot while 'FALSE' would return the prepared data for plotting. Could be useful if user wants to modify the data first and then call the plotting |
model_names |
A character string describing the names to display on X-axis for each model in order they appear in the models parameter. |
Value
A ggplot object or data-frame (if 'plot == FALSE')
Examples
library(DImodels)
## Load data
data(sim2)
## Fit different DI models
mod_AV <- DI(prop = 3:6, DImodel = "AV", data = sim2, y = "response")
mod_FULL <- DI(prop = 3:6, DImodel = "FULL", data = sim2, y = "response")
mod_FG <- DI(prop = 3:6, DImodel = "FG", FG = c("G","G","L","L"),
data = sim2, y = "response")
mod_AV_theta <- DI(prop = 3:6, DImodel = "AV", data = sim2,
y = "response", estimate_theta = TRUE)
mod_FULL_theta <- DI(prop = 3:6, DImodel = "FULL", data = sim2,
y = "response", estimate_theta = TRUE)
mod_FG_theta <- DI(prop = 3:6, DImodel = "FG", FG = c("G","G","L","L"),
data = sim2, y = "response", estimate_theta = TRUE)
models_list <- list("AV model" = mod_AV, "Full model" = mod_FULL,
"FG model" = mod_FG, "AV model_t" = mod_AV_theta,
"Full model_t" = mod_FULL_theta,
"FG model_t" = mod_FG_theta)
## Specific metric
model_selection(models = models_list,
metric = c("AIC"))
## Multiple metrics can be plotted together as well
model_selection(models = models_list,
metric = c("AIC", "BIC"))
## If single metric is specified then breakup of metric
## between goodness of fit and penalty can also be visualised
model_selection(models = models_list,
metric = c("AICc"),
breakup = TRUE)
## Sort models
model_selection(models = models_list,
metric = c("AICc"),
breakup = TRUE, sort = TRUE)
## If multiple metrics are specified then sorting
## will be done on first metric specified in list (AIC in this case)
model_selection(models = models_list,
metric = c("AIC", "BIC", "AICc", "BICc"), sort = TRUE)
## If the list specified in models is not named then
## By default the labels on the X-axis for the models will be
## created by assigning a unique ID to each model sequentially
## in the order they appear in the models object
names(models_list) <- NULL
model_selection(models = models_list,
metric = c("AIC", "BIC", "AICc"), sort = TRUE)
## When possible the variables names of objects containing the
## individual models would be used as axis labels
model_selection(models = list(mod_AV, mod_FULL, mod_FG,
mod_AV_theta, mod_FULL_theta, mod_FG_theta),
metric = c("AIC", "BIC"), sort = TRUE)
## If neither of these two situations are desirable custom labels
## for each model can be specified using the model_names parameter
model_selection(models = list(mod_AV, mod_FULL, mod_FG,
mod_AV_theta, mod_FULL_theta, mod_FG_theta),
metric = c("AIC", "BIC"), sort = TRUE,
model_names = c("AV model", "Full model", "FG model",
"AV theta", "Full theta", "FG theta"))
## Specify `plot = FALSE` to not create the plot but return the prepared data
head(model_selection(models = list(mod_AV, mod_FULL, mod_FG,
mod_AV_theta, mod_FULL_theta, mod_FG_theta),
metric = c("AIC", "BIC"), sort = TRUE, plot = FALSE,
model_names = c("AV model", "Full model", "FG model",
"AV theta", "Full theta", "FG theta")))