pedmod_sqn {pedmod} | R Documentation |
Optimize the Log Marginal Likelihood Using a Stochastic Quasi-Newton Method
Description
Optimizes eval_pedigree_ll
and eval_pedigree_grad
using a stochastic quasi-Newton method.
Usage
pedmod_sqn(
ptr,
par,
maxvls,
abs_eps,
rel_eps,
step_factor,
n_it,
n_grad_steps,
indices = NULL,
minvls = -1L,
n_grad = 50L,
n_hess = 500L,
do_reorder = TRUE,
use_aprx = FALSE,
n_threads = 1L,
cluster_weights = NULL,
fix = NULL,
standardized = FALSE,
minvls_hess = minvls,
maxvls_hess = maxvls,
abs_eps_hess = abs_eps,
rel_eps_hess = rel_eps,
verbose = FALSE,
method = 0L,
check_every = 2L * n_grad_steps,
use_tilting = FALSE,
vls_scales = NULL
)
Arguments
ptr |
object from |
par |
starting values. |
maxvls |
maximum number of samples in the approximation for each marginal likelihood term. |
abs_eps |
absolute convergence threshold for
|
rel_eps |
rel_eps convergence threshold for
|
step_factor |
factor used for the step size. The step size is
|
n_it |
number of stochastic gradient steps to make. |
n_grad_steps |
number of stochastic gradient steps to make between each Hessian approximation update. |
indices |
zero-based vector with indices of which log marginal
likelihood terms to include. Use |
minvls |
minimum number of samples for each marginal likelihood term. Negative values provides a default which depends on the dimension of the integration. |
n_grad |
number of log marginal likelihood terms to include in the stochastic gradient step. |
n_hess |
number of log marginal likelihood terms to include in the
gradients used for the Hessian approximation update. This is set to the
entire sample (or |
do_reorder |
|
use_aprx |
|
n_threads |
number of threads to use. |
cluster_weights |
numeric vector with weights for each cluster. Use
|
fix |
integer vector with indices of |
standardized |
logical for whether to use the standardized or direct
parameterization. See |
minvls_hess |
|
maxvls_hess |
|
abs_eps_hess |
|
rel_eps_hess |
|
verbose |
logical for whether to print output during the estimation. |
method |
integer with the method to use. Zero yields randomized Korobov lattice rules while one yields scrambled Sobol sequences. |
check_every |
integer for the number of gradient steps between checking that the likelihood did increase. If not, the iterations are reset and the step-size is halved. |
use_tilting |
|
vls_scales |
can be a numeric vector with a positive scalar for each
cluster. Then |
Details
The function uses a stochastic quasi-Newton method like suggested by Byrd et al. (2016) with a few differences: Differences in gradients are used rather than Hessian-vector products, BFGS rather than L-BFGS is used because the problem is typically low dimensional, and damped BFGS updates are used (see e.g. chapter 18 of Nocedal and Wright, 2006).
Separate arguments for the gradient approximation in the Hessian update are
provided as one may want a more precise approximation for these gradients.
step_factor
likely depends on the other parameters and the data set
and should be altered.
Value
A list with the following elements:
par |
estimated parameters. |
omegas |
parameter estimates after each iteration. |
H |
Hessian approximation in the quasi-Newton method. It should not be treated as the Hessian. |
References
Byrd, R. H., Hansen, S. L., Nocedal, J., & Singer, Y. (2016). A stochastic quasi-Newton method for large-scale optimization. SIAM Journal on Optimization, 26(2), 1008-1031.
Nocedal, J., & Wright, S. (2006). Numerical optimization. Springer Science & Business Media.
See Also
pedmod_opt
and pedmod_start
.
Examples
# we simulate outcomes with an additive genetic effect. The kinship matrix is
# the same for all families and given by
K <- matrix(c(
0.5 , 0 , 0.25 , 0 , 0.25 , 0 , 0.125 , 0.125 , 0.125 , 0.125 ,
0 , 0.5 , 0.25 , 0 , 0.25 , 0 , 0.125 , 0.125 , 0.125 , 0.125 ,
0.25 , 0.25 , 0.5 , 0 , 0.25 , 0 , 0.25 , 0.25 , 0.125 , 0.125 ,
0 , 0 , 0 , 0.5 , 0 , 0 , 0.25 , 0.25 , 0 , 0 ,
0.25 , 0.25 , 0.25 , 0 , 0.5 , 0 , 0.125 , 0.125 , 0.25 , 0.25 ,
0 , 0 , 0 , 0 , 0 , 0.5 , 0 , 0 , 0.25 , 0.25 ,
0.125, 0.125, 0.25 , 0.25, 0.125, 0 , 0.5 , 0.25 , 0.0625, 0.0625,
0.125, 0.125, 0.25 , 0.25, 0.125, 0 , 0.25 , 0.5 , 0.0625, 0.0625,
0.125, 0.125, 0.125, 0 , 0.25 , 0.25, 0.0625, 0.0625, 0.5 , 0.25 ,
0.125, 0.125, 0.125, 0 , 0.25 , 0.25, 0.0625, 0.0625, 0.25 , 0.5
), 10)
# simulates a data set.
#
# Args:
# n_fams: number of families.
# beta: the fixed effect coefficients.
# sig_sq: the scale parameter.
sim_dat <- function(n_fams, beta = c(-1, 1, 2), sig_sq = 3){
# setup before the simulations
Cmat <- 2 * K
n_obs <- NROW(K)
Sig <- diag(n_obs) + sig_sq * Cmat
Sig_chol <- chol(Sig)
# simulate the data
out <- replicate(
n_fams, {
# simulate covariates
X <- cbind(`(Intercept)` = 1, Continuous = rnorm(n_obs),
Binary = runif(n_obs) > .5)
# assign the linear predictor + noise
eta <- drop(X %*% beta) + drop(rnorm(n_obs) %*% Sig_chol)
# return the list in the format needed for the package
list(y = as.numeric(eta > 0), X = X, scale_mats = list(Cmat))
}, simplify = FALSE)
# add attributes with the true values and return
attributes(out) <- list(beta = beta, sig_sq = sig_sq)
out
}
# simulate the data
set.seed(1)
dat <- sim_dat(100L)
# fit the model
ptr <- pedigree_ll_terms(dat, max_threads = 1L)
start <- pedmod_start(ptr = ptr, data = dat, n_threads = 1L)
fit <- pedmod_sqn(ptr = ptr, par = start$par, n_threads = 1L, use_aprx = TRUE,
maxvls = 5000L, minvls = 1000L, abs_eps = 0, rel_eps = 1e-3,
n_grad_steps = 20L, step_factor = 1, n_grad = 10L,
n_hess = 50L, check_every = 50L, n_it = 1000L)
fit$par # maximum likelihood estimate
# the maximum likelihood
eval_pedigree_ll(ptr = ptr, fit$par, maxvls = 5000L, abs_eps = 0,
rel_eps = 1e-3, minvls = 1000L)