fitRTConfModels {dynConfiR}

R Documentation

Function for fitting several sequential sampling confidence models in parallel

Description

This function is a wrapper around the function fitConfModel (see there for more information). It calls that function for every possible combination of model and participant given in models and data, respectively. See also dWEV, d2DSD, dDDMConf, and dRM for more information about the parameters.

Usage

fitRTConfModels(data, models = c("dynaViTE", "2DSD", "PCRMt"),
  nRatings = NULL, fixed = list(sym_thetas = FALSE), restr_tau = Inf,
  grid_search = TRUE, opts = list(), optim_method = "bobyqa",
  logging = FALSE, precision = 1e-05, parallel = TRUE, n.cores = NULL,
  ...)

Arguments

`data`	a `data.frame` where each row is one trial, containing the following variables (column names can be changed by passing additional arguments of the form `condition="contrast"`): `condition` (optional; for different levels of stimulus quality, will be transformed to a factor), `rating` (discrete confidence judgments, should be given as an integer vector; otherwise will be transformed to integer), `rt` (giving the reaction times for the decision task), two of the following three (see Details for more information about the accepted formats): `stimulus` (encoding the stimulus category in a binary choice task), `response` (encoding the decision response), `correct` (encoding whether the decision was correct; values in 0, 1), and `sbj` (giving the subject ID; the models given in the second argument are fitted for each subject individually. Furthermore, if `logging = TRUE`, the ID is used in files saved with interim results and logging messages).
`models`	character vector with the following possible elements for the models to be fit: "dynWEV", "2DSD", "IRM", "PCRM", "IRMt", and "PCRMt".
`nRatings`	integer. Number of rating categories. If `NULL`, the maximum of `rating` and `length(unique(rating))` is used. This argument is especially important for data sets in which not the whole range of rating categories is realized. If given, `rating` has to be a factor or an integer.
`fixed`	list. List of parameter-value pairs for parameters that should not be fitted (see Details).
`restr_tau`	numerical, `Inf`, or `"simult_conf"`. Used for 2DSD and dynWEV only. Upper bound for tau. Fits will be in the interval (0, `restr_tau`). If `FALSE`, tau will be unbounded. For `"simult_conf"`, see the documentation of `d2DSD` and `dWEV`.
`grid_search`	logical. If `FALSE`, the grid search before the optimization algorithm is omitted. The fitting is then started with a mean parameter set from the default grid. (Default: `TRUE`)
`opts`	list. A list of further control options for the optimization routines (depending on the `optim_method`). See Details for more information.
`optim_method`	character. Determines which optimization function is used for the parameter estimation. Either `"bobyqa"` (default), `"L-BFGS-B"` or `"Nelder-Mead"`. `"bobyqa"` uses a box-constrained optimization with quadratic interpolation (see `bobyqa` for more information). The first two methods use box-constrained optimization. For Nelder-Mead, a transfinite function rescaling is used (i.e. the constrained arguments are suitably transformed to the whole real line).
`logging`	logical. If `TRUE`, a folder 'autosave/fitmodel' is created and messages about the process are printed to a logging file and to the console (depending on OS). Additionally, intermediate results are saved in a `.RData` file with the participant ID in the name.
`precision`	numerical scalar. For 2DSD and dynWEV only. Precision of calculation for the density functions in the respective models (see `dWEV` for more information).
`parallel`	"models", "single", "both" or `FALSE`. If `FALSE`, no parallelization is used in the fitting process. If "models", the fitting process is parallelized over participants and models (i.e. over the calls to the fitting functions). If "single", parallelization is used within the fitting processes (over the initial grid search and the optimization processes for different start points, but see `fitRTConf`). If "both", parallelization is done hierarchically. For a small number of models and participants, "single" or "both" is preferable. Otherwise, you may use "models".
`n.cores`	integer vector or `NULL`. If `parallel` is "models" or "single", a single integer for the number of cores used for parallelization is required. If `parallel` is "both", two values are required: the first for the number of parallel model-participant combinations and the second for the number of parallel processes within the fitting procedures (this may be specified to match the `nAttempts` value in the `opts` argument). If `NULL` (default), the number of available cores minus 1 is used. If `NULL` and `parallel` is "both", the cores will be used for model-participant parallelization only.
`...`	Possibility of giving alternative variable names in data frame (in the form `condition = "SOA"`, or `response="pressedKey"`).
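
As a rough sketch of how the two values of `n.cores` for `parallel = "both"` might be chosen, the following uses only base R and the `parallel` package. The helper `split_cores` is hypothetical and not part of dynConfiR; it assumes the inner level should not exceed the `nAttempts` value.

```r
# Hypothetical helper (not part of dynConfiR): split the available cores
# into c(outer, inner) for parallel = "both".
split_cores <- function(n_available, n_attempts = 5) {
  # inner level: processes within one fit, capped at nAttempts
  inner <- max(1, min(n_attempts, n_available))
  # outer level: model-participant combinations fitted side by side
  outer <- max(1, n_available %/% inner)
  c(outer, inner)
}

# Leave one core free; fall back to 1 if the core count cannot be detected.
n_available <- max(1, parallel::detectCores() - 1, na.rm = TRUE)
n_cores <- split_cores(n_available, n_attempts = 3)
n_cores
```

The resulting vector could then be passed as `n.cores = n_cores` together with `parallel = "both"`.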

Details

The fitting starts with a grid search through an initial grid. The best nAttempts parameter sets are then chosen as starting points for an optimization, which is done with an algorithm that depends on the argument optim_method. The Nelder-Mead algorithm uses the R function optim. The optimization routine is restarted nRestarts times, each time starting from the best parameters of the previous run.
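
The procedure described above can be illustrated with a toy problem. This is only a minimal sketch in base R, not the package's internal code: the objective `neg_log_lik` is a made-up quadratic standing in for a model's negative log-likelihood, and the grid, `nAttempts`, and `nRestarts` values are chosen for illustration.

```r
# Toy objective standing in for a negative log-likelihood (hypothetical).
neg_log_lik <- function(p) (p[1] - 1)^2 + (p[2] + 0.5)^2

# 1. Grid search: evaluate the objective on an initial grid.
grid <- expand.grid(a = seq(-2, 2, by = 1), b = seq(-2, 2, by = 1))
grid$value <- apply(grid[, c("a", "b")], 1, neg_log_lik)

# 2. Keep the nAttempts best grid points as starting values.
nAttempts <- 3
nRestarts <- 2
starts <- grid[order(grid$value)[seq_len(nAttempts)], c("a", "b")]

# 3. Optimize from each start; each further run begins at the
#    optimum found by the previous run.
fits <- lapply(seq_len(nAttempts), function(i) {
  par <- unlist(starts[i, ])
  for (r in seq_len(nRestarts)) {
    fit <- optim(par, neg_log_lik, method = "Nelder-Mead",
                 control = list(maxit = 2000, reltol = 1e-6))
    par <- fit$par
  }
  fit
})

# 4. Keep the best fit across all starting points.
best <- fits[[which.min(vapply(fits, function(f) f$value, numeric(1)))]]
round(best$par, 2)
```

Starting from several grid points reduces the risk of ending in a local minimum, at the cost of more function evaluations.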

stimulus, response and correct. Two of these columns must be given in data. If all three are given, correct will have no effect (and will not be checked!). stimulus can always be given in numerical format with values -1 and 1. response can always be given as a character vector with "lower" and "upper" as values. correct must always be given as a 0-1 vector. If stimulus is given together with response and neither matches the above format, they need to have the same values (or levels, if factors). If only one of stimulus or response is given in any other format, together with correct, the unique values will be sorted increasingly and the first value will be encoded as "lower"/-1 and the second as "upper"/+1.
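
The encoding rules above can be sketched in base R. The data frame `trials` and its labels are made up for illustration, and the recoding shown here only mirrors the description; it is not the package's internal code.

```r
# Hypothetical trial data: stimulus and response share the same labels.
trials <- data.frame(
  stimulus = c("left", "right", "right", "left"),
  response = c("left", "left", "right", "right")
)

# Sort the unique values increasingly: the first maps to -1 / "lower",
# the second to +1 / "upper".
lv <- sort(unique(trials$stimulus))
trials$stimulus_num   <- ifelse(trials$stimulus == lv[1], -1, 1)
trials$response_label <- ifelse(trials$response == lv[1], "lower", "upper")

# correct is a 0-1 vector: 1 when the response matches the stimulus.
trials$correct <- as.integer(trials$stimulus == trials$response)
trials
```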

fixed. Parameters that should not be fitted but kept constant. These will be dropped from the initial grid search but will be present in the output, so that the result contains all parameters needed for prediction. Includes the possibility of symmetric confidence thresholds for both alternatives (sym_thetas=logical). Other examples are z=.5, sv=0, st0=0, sz=0. For race models, setting a='b' (or vice versa) leads to identical upper bounds on the decision processes, which is the equivalent of z=.5 in a diffusion process.

opts. A list with numerical values. Possible options are listed below (together with the optimization method they are used for).

nAttempts (all) number of best performing initial parameter sets used for optimization; default: 5
nRestarts (all) number of successive optim routines for each of the starting parameter sets; default: 5
maxfun ('bobyqa') maximum number of function evaluations; default: 5000
maxit ('Nelder-Mead' and 'L-BFGS-B') maximum iterations; default: 2000
reltol ('Nelder-Mead') relative tolerance; default: 1e-6
factr ('L-BFGS-B') tolerance in terms of reduction factor of the objective; default: 1e-10
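
The fixed and opts lists described above might be assembled as follows. The particular values are arbitrary illustrations, and the final call is left commented out because fitting takes long; it only uses arguments documented on this page.

```r
# Parameters held constant for diffusion-based models (2DSD, dynWEV):
# symmetric confidence thresholds, unbiased starting point, no drift variability.
fixed_diffusion <- list(sym_thetas = TRUE, z = 0.5, sv = 0)

# For race models: identical upper bounds on both decision processes.
fixed_race <- list(sym_thetas = TRUE, a = "b")

# Control options for the default optimizer "bobyqa"
# (fewer attempts and restarts than the defaults to shorten fitting).
opts <- list(nAttempts = 4, nRestarts = 3, maxfun = 3000)

## Not run:
# fits <- fitRTConfModels(data, models = c("dynWEV", "2DSD"),
#                         fixed = fixed_diffusion, opts = opts,
#                         optim_method = "bobyqa",
#                         parallel = "models", n.cores = 2)
## End(Not run)
```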

Value

Returns a data frame with one row for each model-participant combination and columns for the fitted parameters as well as additional information about the fit: negLogLik (for the final parameters), k (number of parameters), N (number of data rows), BIC, AICc and AIC.
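
A result of this shape lends itself to model comparison per participant. The sketch below uses a made-up data frame: the column names `model` and `participant` and all numbers are assumptions for illustration, and the information criteria are recomputed from their textbook definitions, which may differ in detail from the package's computation.

```r
# Hypothetical fit results; all values are invented for illustration.
fits <- data.frame(
  model       = rep(c("dynWEV", "2DSD"), times = 2),
  participant = rep(1:2, each = 2),
  negLogLik   = c(505.0, 515.9, 480.4, 478.8),
  k           = c(10, 8, 10, 8),
  N           = c(400, 400, 390, 390)
)

# Textbook definitions of the information criteria.
fits$AIC  <- 2 * fits$negLogLik + 2 * fits$k
fits$AICc <- fits$AIC + 2 * fits$k * (fits$k + 1) / (fits$N - fits$k - 1)
fits$BIC  <- 2 * fits$negLogLik + fits$k * log(fits$N)

# Select the model with the lowest BIC for each participant.
best <- do.call(rbind, lapply(split(fits, fits$participant),
                              function(d) d[which.min(d$BIC), ]))
best[, c("participant", "model", "BIC")]
```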

Author(s)

Sebastian Hellmann, sebastian.hellmann@ku.de

References

Hellmann, S., Zehetleitner, M., & Rausch, M. (2023). Simultaneous modeling of choice, confidence and response time in visual perception. Psychological Review. Epub ahead of print, 2023 Mar 13. doi: 10.1037/rev0000411. PMID: 36913292.

Examples

# 1. Generate data from two artificial participants
# Get random drift direction (i.e. stimulus category) and
# stimulus discriminability (two steps: hard, easy)
stimulus <- sample(c(-1, 1), 400, replace=TRUE)
discriminability <- sample(c(1, 2), 400, replace=TRUE)

# generate data for participant 1
data <- rWEV(400, a=2, v=stimulus*discriminability*0.5,
             t0=0.2, z=0.5, sz=0.1, sv=0.1, st0=0,  tau=4, s=1, w=0.3)
# discretize confidence ratings (only 2 steps: unsure vs. sure)
data$rating <- as.numeric(cut(data$conf, breaks = c(-Inf, 1, Inf), include.lowest = TRUE))
data$participant = 1
data$stimulus <- stimulus
data$discriminability <- discriminability
# generate data for participant 2
data2 <- rWEV(400, a=2.5, v=stimulus*discriminability*0.7,
             t0=0.1, z=0.7, sz=0, sv=0.2, st0=0,  tau=2, s=1, w=0.5)
# discretize confidence ratings of participant 2 (based on data2, not data)
data2$rating <- as.numeric(cut(data2$conf, breaks = c(-Inf, 0.3, Inf), include.lowest = TRUE))
data2$participant = 2
data2$stimulus <- stimulus
data2$discriminability <- discriminability

# bind data from participants
data <- rbind(data, data2)
data <- data[data$response!=0, ] # drop not finished decision processes
data <- data[, names(data) != "conf"] # drop conf measure (unobservable variable)
head(data)


# 2. Use fitting function
## Not run: 
  # Fitting takes very long to run and uses multiple (6) cores with this
  # call:
  fitRTConfModels(data, models=c("dynWEV", "PCRM"), nRatings = 2,
                logging=FALSE, parallel="both",
                n.cores = c(2,3), # fit two participant-model combinations in parallel
                condition="discriminability") # tell which column is "condition"

## End(Not run)

[Package dynConfiR version 0.0.4 Index]