famos {FAMoS} | R Documentation |
Automated Model Selection
Description
Given a vector containing all parameters of interest and a cost function, FAMoS
looks for the most appropriate subset model to describe the given data.
Usage
famos(
init.par,
fit.fn,
homedir = getwd(),
do.not.fit = NULL,
method = "forward",
init.model.type = "random",
refit = FALSE,
use.optim = TRUE,
optim.runs = 1,
default.val = NULL,
swap.parameters = NULL,
critical.parameters = NULL,
random.borders = 1,
control.optim = list(maxit = 1000),
parscale.pars = FALSE,
con.tol = 0.1,
save.performance = TRUE,
use.futures = FALSE,
reattempt = FALSE,
log.interval = 600,
interactive.session = TRUE,
verbose = FALSE,
...
)
Arguments
init.par |
A named vector containing the initial parameter values. |
fit.fn |
A cost function. Has to take the complete parameter vector as an input (needs to be names |
homedir |
The directory to which the results should be saved. |
do.not.fit |
The names of the parameters that are not supposed to be fitted. Default is NULL. |
method |
The starting method of FAMoS. Options are "forward" (forward search), "backward" (backward elimination) and "swap" (only if |
init.model.type |
The starting model. Options are "global" (starts with the complete model), "random" (creates a randomly sampled starting model) or "most.distant" (uses the model most dissimilar from all previously tested models). Alternatively, a specific model can be used by giving the names of the parameters one wants to start with. Defaults to "random". |
refit |
If TRUE, previously tested models will be tested again. Defaults to FALSE. |
use.optim |
Logical. If true, the cost function |
optim.runs |
The number of times that each model will be optimised. Default to 1. Numbers larger than 1 use random initial conditions (see |
default.val |
A named list containing the values that the non-fitted parameters should take. If NULL, all non-fitted parameters will be set to zero. Each default value can be given either as a numeric value or as the name of the parameter from which the value should be inherited (NOTE: in this case, the entry of that parameter has to contain a numeric value). Defaults to NULL. |
swap.parameters |
A list specifying which parameters are interchangeable. Each swap set is given as a vector containing the names of the respective parameters. Defaults to NULL. |
critical.parameters |
A list specifying sets of critical parameters. A critical set is a parameter set of which at least one parameter has to be present in each tested model. Defaults to NULL. |
random.borders |
The ranges from which the random initial parameter conditions for all |
control.optim |
Control parameters passed along to |
parscale.pars |
Logical. If TRUE, the |
con.tol |
The absolute convergence tolerance of each fitting run (see Details). Default is set to 0.1. |
save.performance |
Logical. If TRUE, the performance of |
use.futures |
Logical. If TRUE, FAMoS submits model evaluations via |
reattempt |
Logical. If TRUE, FAMoS will jump to a distant model once the search methods are exhausted and continue from there. The algorithm terminates if the best model is encountered again or if all neighbouring models have been tested. If FALSE (default), FAMoS will terminate once the search methods are exhausted. |
log.interval |
The interval (in seconds) at which FAMoS reports the current status, i.e. which models are still running and how much time has passed. Defaults to 600 (= 10 minutes). |
interactive.session |
Logical. If TRUE (default), FAMoS assumes it is running in an interactive session and users can supply input. If FALSE, no input is expected from the user, which can be helpful when running the script non-locally. |
verbose |
Logical. If TRUE, FAMoS will output all details about the current fitting procedure. |
... |
Other arguments that will be passed along to |
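The list-valued arguments described above (default.val, swap.parameters and critical.parameters) can be sketched as follows. This is only an illustration of the expected shapes: the parameter names are borrowed from the Examples section, the particular sets are made up, and satisfies_critical is a hypothetical helper that is not part of FAMoS. Whether default.val has to cover every parameter is not stated on this page; the sketch simply lists all of them.

```r
# Parameter names as used in the Examples section of this page.
inits <- c(p1 = 3, p2 = 4, p3 = -2, p4 = 2, p5 = 0)

# default.val: named list; an entry is either a number or the name of another
# parameter to inherit from (that parameter's own entry must be numeric).
defaults <- list(p1 = 0, p2 = 0, p3 = 0, p4 = "p3", p5 = 0)

# swap.parameters: list of swap sets, each a character vector of names.
swaps <- list(c("p1", "p5"))

# critical.parameters: list of critical sets (illustrative choice).
critical <- list(c("p2", "p3"))

# All names used in the lists should refer to existing parameters.
stopifnot(all(names(defaults) %in% names(inits)),
          all(unlist(swaps) %in% names(inits)),
          all(unlist(critical) %in% names(inits)))

# Hypothetical helper (not a FAMoS function): TRUE if a model, given as a
# vector of parameter names, contains at least one member of every critical set.
satisfies_critical <- function(model, critical.sets) {
  all(vapply(critical.sets, function(s) any(s %in% model), logical(1)))
}

ok  <- satisfies_critical(c("p1", "p3"), critical)  # p3 is in the critical set
bad <- satisfies_critical(c("p1", "p5"), critical)  # neither p2 nor p3 present

# The lists would then be passed on roughly like this (not run here):
# famos(init.par = inits, fit.fn = cost_function, default.val = defaults,
#       swap.parameters = swaps, critical.parameters = critical, ...)
```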
Details
In each iteration, FAMoS finds all neighbouring models based on the current model and method, and subsequently tests them. If one of the tested models performs better than the current model, the model, but not the method, will be updated. Otherwise, the method, but not the model, will be adaptively changed, depending on the previously used methods.
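To make the notion of neighbouring models more concrete, here is a minimal sketch. It assumes that a forward step adds exactly one parameter to the current model and a backward step removes exactly one; swap steps, critical sets and do.not.fit are left out. The function neighbours is a hypothetical helper for illustration and not FAMoS's internal implementation.

```r
# Hypothetical helper, not part of the FAMoS API.
# 'binary' is a named 0/1 vector: 1 = parameter is fitted, 0 = not fitted.
neighbours <- function(binary, method = c("forward", "backward")) {
  method <- match.arg(method)
  # forward flips one 0 to 1, backward flips one 1 to 0
  target <- if (method == "forward") 0 else 1
  candidates <- which(binary == target)
  lapply(candidates, function(i) {
    b <- binary
    b[i] <- 1 - b[i]
    b
  })
}

current <- c(p1 = 1, p2 = 0, p3 = 1, p4 = 0, p5 = 0)

fwd <- neighbours(current, "forward")   # one model per parameter that can be added
bwd <- neighbours(current, "backward")  # one model per parameter that can be removed
```

In the procedure described above, each of these candidate models would be fitted and compared with the current model by its selection criterion value.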
The cost function fit.fn can take the following inputs:
- parms
A named vector containing all parameter values. This input is mandatory. If use.optim = TRUE, FAMoS will automatically subset the complete parameter set into fitted and non-fitted parameters.
- binary
Optional input. The binary vector contains the information which parameters are currently fitted. Fitted parameters are set to 1, non-fitted to 0. This input can be used to split the complete parameter set into fitted and non-fitted parameters if a customised optimisation function is used (see use.optim).
- ...
Other parameters that should be passed to fit.fn.
If use.optim = TRUE, the cost function needs to return a single numeric value, which corresponds to the selection criterion value. However, if use.optim = FALSE, the cost function needs to return a list containing in its first entry the selection criterion value and in its second entry the named vector of the fitted parameter values (non-fitted parameters are internally assessed).
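For the use.optim = FALSE case, a cost function with its own optimisation step might look like the sketch below. It assumes that parms holds the complete parameter vector and that binary marks the fitted entries with 1, as described above. The toy model (a quadratic in x), the parameter names a, b and c, the data set and the AIC-like penalty are all made up for illustration, and the sketch does not handle a model without any fitted parameter.

```r
# Illustrative cost function for use.optim = FALSE (assumptions: see text).
cost_custom <- function(parms, binary, data) {
  fitted <- binary == 1
  # Sum of squared residuals as a function of the fitted parameters only;
  # non-fitted parameters keep the values supplied in 'parms'.
  ssr <- function(p) {
    full <- parms
    full[fitted] <- p
    pred <- full[["a"]] + full[["b"]] * data$x + full[["c"]] * data$x^2
    sum((pred - data$y)^2)
  }
  # Custom optimisation step: base R optim() on the fitted subset.
  fit <- optim(par = parms[fitted], fn = ssr, method = "BFGS")
  best <- setNames(fit$par, names(parms)[fitted])
  # First entry: selection criterion value (here SSR plus a simple penalty).
  # Second entry: named vector of the fitted parameter values.
  list(fit$value + 2 * sum(fitted), best)
}

# Toy data generated from y = 1 + 2 * x (no noise).
toy <- data.frame(x = 1:10)
toy$y <- 1 + 2 * toy$x

out <- cost_custom(parms  = c(a = 0, b = 0, c = 0),
                   binary = c(a = 1, b = 1, c = 0),
                   data   = toy)
```

Here out[[1]] is the criterion value and out[[2]] the named vector of fitted values for a and b; c is left at the value passed in parms.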
Value
A list containing the following elements:
- SCV
The value of the selection criterion of the best model.
- par
The values of the fitted parameter vector corresponding to the best model.
- binary
The binary information of the best model.
- vector
Vector indicating which parameters were fitted in the best model.
- total.models.tested
The total number of different models that were analysed. May include repeats.
- mrun
The number of the current FAMoS run.
- initial.model
The first model evaluated by the FAMoS run.
Examples
#setting data
true.p2 <- 3
true.p5 <- 2
sim.data <- cbind.data.frame(range = 1:10,
                             y = true.p2^2 * (1:10)^2 - exp(true.p5 * (1:10)))
#define initial parameter values and corresponding test function
inits <- c(p1 = 3, p2 = 4, p3 = -2, p4 = 2, p5 = 0)
cost_function <- function(parms, binary, data){
  if(max(abs(parms)) > 5){
    return(NA)
  }
  with(as.list(c(parms)), {
    res <- p1*4 + p2^2*data$range^2 + p3*sin(data$range) + p4*data$range - exp(p5*data$range)
    diff <- sum((res - data$y)^2)
    #calculate AICC
    nr.par <- length(which(binary == 1))
    nr.data <- nrow(data)
    AICC <- diff + 2*nr.par + 2*nr.par*(nr.par + 1)/(nr.data - nr.par -1)
    return(AICC)
  })
}
#set swap set
swaps <- list(c("p1", "p5"))
#perform model selection
famos(init.par = inits,
      fit.fn = cost_function,
      homedir = tempdir(),
      method = "swap",
      swap.parameters = swaps,
      init.model.type = c("p1", "p3"),
      optim.runs = 1,
      data = sim.data)
#delete tempdir
unlink(paste0(tempdir(),"/FAMoS-Results"), recursive = TRUE)