MaxControl-class {maxLik} | R Documentation |
Class "MaxControl"
Description
This is the structure that holds the optimization control options.
The corresponding constructors take
the parameters, perform consistency checks, and return the
control structure. Alternatively, they overwrite the supplied
parameters in an existing MaxControl structure. There is also
a method to extract the control structure from estimated
‘maxim’ objects.
Slots
The default values and definition of the slots:
- tol
1e-8, stopping condition for maxNR and related optimizers. Stop if the absolute difference between successive iterations is less than tol; returns code 2.
- reltol
sqrt(.Machine$double.eps), relative convergence tolerance (used by maxNR-related optimizers and optim-based optimizers). The algorithm stops if an iteration increases the value by less than a factor of reltol*(abs(val) + reltol). Returns code 2.
- gradtol
1e-6, stopping condition for maxNR and related optimizers. Stops if the norm of the gradient is less than gradtol; returns code 1.
- steptol
1e-10, stopping/error condition for maxNR and related optimizers. If qac == "stephalving" and the quadratic approximation leads to a worse, instead of a better, value, or to NA, the step length is halved and a new attempt is made. If necessary, this procedure is repeated until step < steptol, after which code 3 is returned.
- lambdatol
1e-6, (for maxNR-related optimizers) controls whether the Hessian is treated as negative definite. If the largest eigenvalue of the Hessian is larger than -lambdatol (the Hessian is not negative definite), a suitable diagonal matrix is subtracted from the Hessian (quadratic hill-climbing) in order to enforce negative definiteness.
- qac
"stephalving", character, Quadratic Approximation Correction for maxNR-related optimizers. When the new guess is worse than the initial one, the program attempts to correct it: "stephalving" decreases the step but keeps the direction. "marquardt" uses the Marquardt (1963) method, decreasing the step length while also moving closer to the pure gradient direction. It may be a faster and more robust choice in areas where the quadratic approximation behaves poorly.
- qrtol
1e-10, QR-decomposition tolerance for Hessian inversion in maxNR-related optimizers.
- marquardt_lambda0
0.01, a positive numeric, initial correction term for the Marquardt (1963) correction in maxNR-related optimizers.
- marquardt_lambdaStep
2, how much the Marquardt (1963) correction is decreased/increased at a successful/unsuccessful step for maxNR-related optimizers.
- marquardt_maxLambda
1e12, maximum allowed correction term for maxNR-related optimizers. If exceeded, the algorithm exits with return code 3.
- nm_alpha
1, Nelder-Mead simplex method reflection factor (see Nelder & Mead, 1965)
- nm_beta
0.5, Nelder-Mead contraction factor
- nm_gamma
2, Nelder-Mead expansion factor
- sann_cand
NULL or a function for the "SANN" algorithm to generate a new candidate point; if NULL, a Gaussian Markov kernel is used (see argument gr of optim).
- sann_temp
10, starting temperature for the “SANN” cooling schedule. See optim.
- sann_tmax
10, number of function evaluations at each temperature for the “SANN” optimizer. See optim.
- sann_randomSeed
123, integer to seed random numbers to ensure replicability of “SANN” optimization and preserve R random numbers. Use options like sann_randomSeed=Sys.time() or sann_randomSeed=sample(1000,1) if you want stochastic results.
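The stopping rules tied to tol, reltol and gradtol can be read as the following base-R sketch. It is an illustration of the descriptions above and not maxLik's actual implementation: the helper checkConvergence is hypothetical, and the order of the checks, the use of the new objective value in abs(val), the reading of tol as a difference in objective values, and the return value 0 for "keep iterating" are all assumptions.

```r
# Hypothetical helper, not part of maxLik: returns the documented code
# (1 or 2) when a stopping rule fires, and 0 (assumed marker) otherwise.
checkConvergence <- function(valOld, valNew, grad,
                             tol = 1e-8,
                             reltol = sqrt(.Machine$double.eps),
                             gradtol = 1e-6) {
  # gradtol: norm of the gradient is close to zero
  if (sqrt(sum(grad^2)) < gradtol) return(1L)
  # tol: absolute change between successive iterations is tiny
  if (abs(valNew - valOld) < tol) return(2L)
  # reltol: the improvement is small relative to the size of the value
  if (valNew - valOld < reltol * (abs(valNew) + reltol)) return(2L)
  0L
}

checkConvergence(-1, -0.5, grad = c(1, 1))           # clear progress
checkConvergence(-1, -1 + 1e-12, grad = c(1, 1))     # absolute rule fires
checkConvergence(-1e6, -1e6 + 1e-3, grad = c(1, 1))  # relative rule fires
checkConvergence(-1, -0.5, grad = c(1e-8, 0))        # gradient rule fires
```

Tightening tol or gradtol in such a scheme makes the optimizer run longer before declaring convergence, which is the practical trade-off when changing these slots.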
General options for stochastic gradient methods:
- SG_learningRate
0.1, learning rate, numeric
- SG_batchSize
NULL, batch size for Stochastic Gradient Ascent. A positive integer, or NULL for full-batch gradient ascent.
- SG_clip
NULL, gradient clipping threshold. This is the maximum allowed squared Euclidean norm of the gradient. If the actual norm of the gradient exceeds (the square root of) this threshold, the gradient is scaled back accordingly while preserving its direction. NULL means no clipping.
- SG_patience
NULL, or integer. Stopping condition: if the objective function is worse than its largest value so far this many times, the algorithm stops and returns not the last parameter value but the one that gave the best result so far. This is mostly useful if the gradient is computed on training data and the objective function on validation data.
- SG_patienceStep
1L, integer. After how many epochs to check the patience value. 1 means to check (and hence to compute the objective function) at each epoch.
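The clipping rule described for SG_clip can be sketched in base R as follows. This is an illustration of the description above, not the package's internal code; the helper name clipGradient is hypothetical.

```r
# Hypothetical helper, not part of maxLik: scale the gradient back so that
# its squared Euclidean norm does not exceed `clip`, keeping its direction.
clipGradient <- function(grad, clip = NULL) {
  if (is.null(clip)) return(grad)   # NULL means no clipping
  sqNorm <- sum(grad^2)             # squared Euclidean norm
  if (sqNorm > clip) {
    grad <- grad * sqrt(clip / sqNorm)
  }
  grad
}

g <- c(3, 4)                        # squared norm is 25
clipped <- clipGradient(g, clip = 1)
clipped                             # same direction as g, shorter
sum(clipped^2)                      # squared norm after clipping
```

Because the threshold is stated on the squared norm, a value of SG_clip = 1 corresponds to a maximum gradient length of 1, while SG_clip = 100 corresponds to a maximum length of 10.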
Options for SGA:
- SGA_momentum
0, numeric momentum parameter for SGA. Must lie in the interval [0,1].
Options for Adam:
- Adam_momentum1
0.9, numeric in [0,1], the first moment momentum
- Adam_momentum2
0.999, numeric in [0,1], the second moment momentum
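To show what the two momentum parameters control, here is a base-R sketch of a single Adam step in its usual textbook form, written for ascent. It is not taken from maxLik, whose implementation may differ in details; the helper adamStep and the small constant eps are assumptions introduced for illustration.

```r
# Hypothetical helper, not part of maxLik: one textbook Adam ascent step.
adamStep <- function(theta, grad, m, v, t,
                     learningRate = 0.1,  # plays the role of SG_learningRate
                     momentum1 = 0.9,     # plays the role of Adam_momentum1
                     momentum2 = 0.999,   # plays the role of Adam_momentum2
                     eps = 1e-8) {        # assumed stabilizing constant
  m <- momentum1 * m + (1 - momentum1) * grad    # running mean of gradients
  v <- momentum2 * v + (1 - momentum2) * grad^2  # running mean of squared gradients
  mHat <- m / (1 - momentum1^t)                  # bias correction at step t
  vHat <- v / (1 - momentum2^t)
  theta <- theta + learningRate * mHat / (sqrt(vHat) + eps)  # ascent step
  list(theta = theta, m = m, v = v)
}

s <- adamStep(theta = c(0, 0), grad = c(2, -4),
              m = c(0, 0), v = c(0, 0), t = 1)
s$theta
```

In this formulation a first-moment momentum closer to 1 smooths the direction over more past gradients, and a second-moment momentum closer to 1 makes the per-component step scaling react more slowly.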
General options:
- iterlim
150, stopping condition (the default differs for different methods). Stop if more than iterlim iterations are performed. Note that ‘iteration’ may mean different things for different optimizers.
- max.rows
20, maximum number of matrix rows to be printed when requesting verbosity in the optimizers.
- max.cols
7, maximum number of columns to be printed. This also applies to vectors that are printed horizontally.
- printLevel
0, the level of verbosity. Larger values print more information. The result depends on the optimizer. The form print.level is also accepted by the methods for compatibility.
- storeParameters
FALSE, whether to store and return the parameter values at each epoch. If TRUE, the stored values can be retrieved with the storedParameters method. The parameters are stored as a matrix with rows corresponding to the epochs and columns to the parameter components.
- storeValues
FALSE, whether to store and return the objective function values at each epoch. If TRUE, the stored values can be retrieved with the storedValues method.
Methods
- maxControl(...)
creates a “MaxControl” object. The arguments must be in the form option1 = value1, option2 = value2, .... The options should be slot names, but the method also supports selected other parameter forms for compatibility reasons, e.g. “print.level” instead of “printLevel”. If more than one option has the same name, the last one overwrites the previous values. This allows the user to override default parameters in the control list. See the example in maxLik-package.
- maxControl(x = "MaxControl", ...)
overwrites parameters of an existing “MaxControl” object. The ‘...’ argument must be in the form option1 = value1, option2 = value2, .... If more than one option has the same name, only the last one is taken into account. This allows the user to override default parameters in the control list. See the example in maxLik-package.
- maxControl(x = "maxim")
extracts the “MaxControl” structure from an estimated model
- show
shows the parameter values
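A short sketch of the first two methods, based only on the descriptions above: it creates a control object using the compatibility spelling print.level and then overwrites one option in the existing object. Reading the slots with slot() assumes the S4 slots are named exactly as listed in the Slots section.

```r
library(maxLik)

# Create a control object; print.level is the compatibility form of printLevel
co <- maxControl(print.level = 1, iterlim = 50)

# Overwrite a single option of the existing object; the rest is kept
co2 <- maxControl(co, iterlim = 100)

slot(co2, "printLevel")  # carried over from co
slot(co2, "iterlim")     # overwritten value
```

The original object co is not modified by the second call, so both control sets remain available for comparison runs.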
Details
Typically, the control options are supplied in the form of a list, in which
case the corresponding default values are overwritten by the
user-specified ones. However, one may also create the control
structure with maxControl(opt1=value1, opt2=value2, ...) and
supply that object directly to the optimizer. In this case the
optimization routine takes all the values from the control object.
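The two ways of supplying options can be compared side by side. The sketch below uses a simple concave objective chosen for illustration (the function fn and its maximizer c(1, 2) are not from the package) and passes the same options once as a list and once as a “MaxControl” object.

```r
library(maxLik)

# Illustrative objective with its maximum at c(1, 2)
fn <- function(x) -sum((x - c(1, 2))^2)

# Options supplied as a plain list: defaults are overwritten by these values
res1 <- maxNR(fn, start = c(0, 0), control = list(iterlim = 50, tol = 1e-10))

# Options supplied as a ready-made control object
co <- maxControl(iterlim = 50, tol = 1e-10)
res2 <- maxNR(fn, start = c(0, 0), control = co)

res1$estimate
res2$estimate
```

A control object is convenient when the same settings are reused across several optimizer calls, while a list is shorter for a one-off call.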
Note
Several control parameters can also be supplied directly to the optimization routines.
Author(s)
Ott Toomet
References
Nelder, J. A. & Mead, R. (1965) A Simplex Method for Function Minimization. The Computer Journal 7, 308–313.
Marquardt, D. W. (1963) An Algorithm for Least-Squares Estimation of Nonlinear Parameters. Journal of the Society for Industrial and Applied Mathematics 11, 431–441.
Examples
library(maxLik)
## Create a 'maxControl' object:
maxControl(tol=1e-4, sann_tmax=7, printLevel=2)
## Maximize the negative quadratic form -t(D) %*% W %*% D with p.d. weight matrix W
## (the maxNR calls below do not impose the constraint sum(D) = 1)
quadForm <- function(D) {
return(-t(D) %*% W %*% D)
}
eps <- 0.1
W <- diag(3) + matrix(runif(9), 3, 3)*eps
D <- rep(1/3, 3)  # initial values
## create control object and use it for optimization
co <- maxControl(printLevel=2, qac="marquardt", marquardt_lambda0=1)
res <- maxNR(quadForm, start=D, control=co)
print(summary(res))
## Now perform the same with no trace information
co <- maxControl(co, printLevel=0)
res <- maxNR(quadForm, start=D, control=co) # no tracing information
print(summary(res)) # should be the same as above
maxControl(res) # shows the control structure