famos {FAMoS}    R Documentation

Automated Model Selection

Description

Given a vector containing all parameters of interest and a cost function, FAMoS searches for the most appropriate subset model to describe the given data.

Usage

famos(
  init.par,
  fit.fn,
  homedir = getwd(),
  do.not.fit = NULL,
  method = "forward",
  init.model.type = "random",
  refit = FALSE,
  use.optim = TRUE,
  optim.runs = 1,
  default.val = NULL,
  swap.parameters = NULL,
  critical.parameters = NULL,
  random.borders = 1,
  control.optim = list(maxit = 1000),
  parscale.pars = FALSE,
  con.tol = 0.1,
  save.performance = TRUE,
  use.futures = FALSE,
  reattempt = FALSE,
  log.interval = 600,
  interactive.session = TRUE,
  verbose = FALSE,
  ...
)

Arguments

init.par

A named vector containing the initial parameter values.

fit.fn

A cost function. It has to take the complete parameter vector as an input (the argument needs to be named parms) and must return a selection criterion value (e.g. AICc or BIC). See Details for more information.

homedir

The directory in which the results are saved.

do.not.fit

The names of the parameters that are not supposed to be fitted. Default is NULL.

method

The starting method of FAMoS. Options are "forward" (forward search), "backward" (backward elimination) and "swap" (only if critical.parameters or swap.parameters are supplied). The method is adaptively changed over the iterations of FAMoS. Defaults to "forward".

init.model.type

The starting model. Options are "global" (starts with the complete model), "random" (creates a randomly sampled starting model) or "most.distant" (uses the model most dissimilar from all other previously tested models). Alternatively, a specific model can be used by giving the names of the parameters one wants to start with. Defaults to "random".

refit

Logical. If TRUE, previously tested models will be tested again. Defaults to FALSE.

use.optim

Logical. If TRUE, the cost function fit.fn will be fitted via optim. If FALSE, the cost function will only be evaluated. Defaults to TRUE.

optim.runs

The number of times that each model will be optimised. Defaults to 1. For values larger than 1, the additional runs use random initial conditions (see random.borders).

default.val

A named list containing the values that the non-fitted parameters should take. If NULL, all non-fitted parameters will be set to zero. Each default value can be given either as a numeric value or as the name of the parameter from which the value should be inherited (NOTE: in this case, the entry of the parameter inherited from has to contain a numeric value). Defaults to NULL.

swap.parameters

A list specifying which parameters are interchangeable. Each swap set is given as a vector containing the names of the respective parameters. Defaults to NULL.

critical.parameters

A list specifying sets of critical parameters. A critical set is a set of parameters of which at least one has to be present in each tested model. Defaults to NULL.

random.borders

The ranges from which the random initial parameter conditions for all optim.runs larger than one are sampled. Can be given either as a vector containing the relative deviations for all parameters or as a matrix containing the lower border values in its first column and the upper border values in its second column. Parameters are sampled uniformly via runif. Defaults to 1 (100% deviation of all parameters). Alternatively, functions such as rnorm, rchisq, etc. can be used if the additional arguments are passed along as well.

control.optim

Control parameters passed along to optim. For more details, see optim.

parscale.pars

Logical. If TRUE, the parscale option will be used when fitting with optim. This can help to speed up the fitting procedure if the parameter values are on different scales. Defaults to FALSE.

con.tol

The absolute convergence tolerance of each fitting run (see Details). Default is set to 0.1.

save.performance

Logical. If TRUE, the performance of FAMoS will be evaluated in each iteration via famos.performance, which will save the corresponding plots into the folder "FAMoS-Results/Figures/" (starting from iteration 3) and simultaneously show them on screen. Defaults to TRUE.

use.futures

Logical. If TRUE, FAMoS submits model evaluations via futures. For more information, see the future package.

reattempt

Logical. If TRUE, FAMoS will jump to a distant model once the search methods are exhausted, and continue from there. The algorithm terminates if the best model is encountered again or if all neighbouring models have been tested. If FALSE (default), FAMoS will terminate once the search methods are exhausted.

log.interval

The interval (in seconds) at which FAMoS reports the current status, i.e. which models are still running and how much time has passed. Defaults to 600 (= 10 minutes).

interactive.session

Logical. If TRUE (default), FAMoS assumes it is running in an interactive session and users can supply input. If FALSE, no input is expected from the user, which can be helpful when running the script non-locally.

verbose

Logical. If TRUE, FAMoS will output all details about the current fitting procedure.

...

Other arguments that will be passed along to future, optim or the user-specified cost function fit.fn.
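
The list- and matrix-valued arguments above can be put together as in the following sketch. It reuses the parameter names p1 to p5 from the Examples section; the concrete values and sets are illustrative assumptions, and satisfies.critical is a hypothetical helper written for this sketch, not a FAMoS function.

```r
# Initial values, named as in the Examples section
inits <- c(p1 = 3, p2 = 4, p3 = -2, p4 = 2, p5 = 0)

# default.val: a numeric value, or the name of a parameter to inherit from.
# p5 inherits from p4, whose own entry is numeric, as the documentation requires.
defaults <- list(p1 = 0, p2 = 0, p3 = 0, p4 = 1, p5 = "p4")

# swap.parameters: each list element is one set of interchangeable parameters
swaps <- list(c("p1", "p5"))

# critical.parameters: every tested model needs at least one member of each set
crits <- list(c("p2", "p3"))

# random.borders, matrix form: first column lower, second column upper border
borders <- cbind(lower = rep(-5, length(inits)),
                 upper = rep(5, length(inits)))
rownames(borders) <- names(inits)  # assumption: row names are for readability only

# Hypothetical helper (not part of FAMoS): checks the critical-set rule
# for a model given as a vector of parameter names
satisfies.critical <- function(model, critical.sets) {
  all(vapply(critical.sets, function(s) any(s %in% model), logical(1)))
}

satisfies.critical(c("p1", "p3"), crits)  # p3 covers the critical set
satisfies.critical(c("p1", "p5"), crits)  # neither p2 nor p3 is present
```

These objects would then be passed as default.val = defaults, swap.parameters = swaps, critical.parameters = crits and random.borders = borders.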

Details

In each iteration, FAMoS finds all neighbouring models based on the current model and method, and subsequently tests them. If one of the tested models performs better than the current model, the model, but not the method, will be updated. Otherwise, the method, but not the model, will be adaptively changed, depending on the previously used methods.
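
To make the notion of neighbouring models concrete, the sketch below enumerates the neighbours of a model for each method. The function neighbours is a hypothetical helper written for this illustration; it assumes that "forward" adds one parameter, "backward" removes one, and "swap" exchanges a fitted parameter for a non-fitted one from the same swap set. The actual FAMoS implementation may apply further restrictions (for example critical sets, do.not.fit or previously tested models).

```r
# Hypothetical helper (not part of FAMoS): list the neighbours of a model.
# 'binary' is a named 0/1 vector; 1 marks a parameter that is currently fitted.
neighbours <- function(binary, method, swap.sets = NULL) {
  on  <- names(binary)[binary == 1]
  off <- names(binary)[binary == 0]
  # flip the 0/1 state of the named parameters
  flip <- function(nms) {
    b <- binary
    b[nms] <- 1 - b[nms]
    b
  }
  switch(method,
    forward  = lapply(off, flip),  # add one parameter at a time
    backward = lapply(on, flip),   # remove one parameter at a time
    swap = {
      out <- list()
      for (s in swap.sets) {
        # exchange a fitted member of the set for a non-fitted one
        for (rem in intersect(s, on)) {
          for (add in intersect(s, off)) {
            out[[length(out) + 1]] <- flip(c(rem, add))
          }
        }
      }
      out
    },
    stop("unknown method: ", method)
  )
}

current <- c(p1 = 1, p2 = 0, p3 = 1, p4 = 0, p5 = 0)
length(neighbours(current, "forward"))   # one neighbour per non-fitted parameter
length(neighbours(current, "backward"))  # one neighbour per fitted parameter
neighbours(current, "swap", swap.sets = list(c("p1", "p5")))
```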

The cost function fit.fn can take the following inputs:

parms

A named vector containing all parameter values. This input is mandatory. If use.optim = TRUE, FAMoS will automatically subset the complete parameter set into fitted and non-fitted parameters.

binary

Optional input. The binary vector indicates which parameters are currently fitted. Fitted parameters are set to 1, non-fitted to 0. This input can be used to split the complete parameter set into fitted and non-fitted parameters if a customised optimisation function is used (see use.optim).

...

Other arguments that should be passed to fit.fn.

If use.optim = TRUE, the cost function needs to return a single numeric value, which corresponds to the selection criterion value. However, if use.optim = FALSE, the cost function needs to return a list containing in its first entry the selection criterion value and in its second entry the named vector of the fitted parameter values (non-fitted parameters are internally assessed).
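
The two return conventions can be sketched as follows. The toy data, the linear model y = a + b * x, the penalty term and all function names are assumptions made for this illustration. The sketch also assumes that binary is ordered like parms, and it uses optim with method "BFGS" as just one possible customised optimiser.

```r
# Toy data: a straight line with known intercept and slope (illustrative only)
toy <- data.frame(x = 1:10, y = 2 * (1:10) + 1)

# Residual sum of squares plus an AIC-style penalty (assumed criterion)
scv <- function(parms, n.fitted, data) {
  rss <- sum((parms[["a"]] + parms[["b"]] * data$x - data$y)^2)
  rss + 2 * n.fitted
}

# Form 1 (use.optim = TRUE): return a single numeric value
cost.single <- function(parms, binary, data) {
  scv(parms, sum(binary == 1), data)
}

# Form 2 (use.optim = FALSE): optimise the fitted subset ourselves and return
# list(selection criterion value, named vector of fitted parameter values)
cost.custom <- function(parms, binary, data) {
  fitted <- names(parms)[binary == 1]
  if (length(fitted) == 0) {
    # nothing to optimise: only evaluate the criterion
    return(list(scv(parms, 0, data), parms[fitted]))
  }
  objective <- function(p) {
    full <- parms
    full[fitted] <- p  # insert the candidate values of the fitted parameters
    scv(full, length(fitted), data)
  }
  fit <- optim(par = parms[fitted], fn = objective, method = "BFGS")
  list(fit$value, fit$par)
}

start <- c(a = 0, b = 1)
cost.single(start, binary = c(1, 1), data = toy)
cost.custom(start, binary = c(1, 1), data = toy)
```

In both forms the first argument is named parms, as fit.fn requires.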

Value

A list containing the following elements:

SCV

The value of the selection criterion of the best model.

par

The values of the fitted parameter vector corresponding to the best model.

binary

The binary information of the best model.

vector

Vector indicating which parameters were fitted in the best model.

total.models.tested

The total number of different models that were analysed. May include repeats.

mrun

The number of the current FAMoS run.

initial.model

The first model evaluated by the FAMoS run.

Examples


#setting data
true.p2 <- 3
true.p5 <- 2
sim.data <- cbind.data.frame(range = 1:10,
                             y = true.p2^2 * (1:10)^2 - exp(true.p5 * (1:10)))

#define initial parameter values and corresponding test function
inits <- c(p1 = 3, p2 = 4, p3 = -2, p4 = 2, p5 = 0)

cost_function <- function(parms, binary, data){
  #reject parameter sets with entries outside the range [-5, 5]
  if(max(abs(parms)) > 5){
    return(NA)
  }
  with(as.list(c(parms)), {
    #model prediction and residual sum of squares
    res <- p1*4 + p2^2*data$range^2 + p3*sin(data$range) + p4*data$range - exp(p5*data$range)
    diff <- sum((res - data$y)^2)

    #calculate AICC
    nr.par <- length(which(binary == 1))
    nr.data <- nrow(data)
    AICC <- diff + 2*nr.par + 2*nr.par*(nr.par + 1)/(nr.data - nr.par -1)

    return(AICC)
  })
}


#set swap set
swaps <- list(c("p1", "p5"))

#perform model selection
famos(init.par = inits,
      fit.fn = cost_function,
      homedir = tempdir(),
      method = "swap",
      swap.parameters = swaps,
      init.model.type = c("p1", "p3"),
      optim.runs = 1,
      data = sim.data)

#remove the FAMoS results folder from tempdir
unlink(file.path(tempdir(), "FAMoS-Results"), recursive = TRUE)

[Package FAMoS version 0.3.0 Index]