pedmod_opt {pedmod}R Documentation

Optimize the Log Marginal Likelihood

Description

Optimizes eval_pedigree_ll and eval_pedigree_grad using a passed optimization function.

Usage

pedmod_opt(
  ptr,
  par,
  maxvls,
  abs_eps,
  rel_eps,
  opt_func = NULL,
  seed = 1L,
  indices = NULL,
  minvls = -1L,
  do_reorder = TRUE,
  use_aprx = FALSE,
  n_threads = 1L,
  cluster_weights = NULL,
  fix = NULL,
  standardized = FALSE,
  method = 0L,
  use_tilting = FALSE,
  vls_scales = NULL,
  ...
)

pedmod_start(
  ptr,
  data,
  maxvls = 1000L,
  abs_eps = 0,
  rel_eps = 0.01,
  seed = 1L,
  indices = NULL,
  scale_max = 9,
  minvls = 100L,
  do_reorder = TRUE,
  use_aprx = TRUE,
  n_threads = 1L,
  cluster_weights = NULL,
  standardized = FALSE,
  method = 0L,
  sc_start = NULL,
  use_tilting = FALSE,
  vls_scales = NULL
)

pedmod_start_loadings(
  ptr,
  data,
  indices = NULL,
  cluster_weights = NULL,
  sc_start_invariant = NULL
)

Arguments

ptr

object from pedigree_ll_terms or pedigree_ll_terms_loadings.

par

starting values passed to opt_func.

maxvls

maximum number of samples in the approximation for each marginal likelihood term.

abs_eps

absolute convergence threshold for eval_pedigree_ll and eval_pedigree_grad.

rel_eps

rel_eps convergence threshold for eval_pedigree_ll and eval_pedigree_grad.

opt_func

function to perform minimization with arguments like optim. BFGS is used with optim if this argument is NULL.

seed

seed to pass to set.seed before each gradient and function evaluation. Use NULL if the seed should not be fixed.

indices

zero-based vector with indices of which log marginal likelihood terms to include. Use NULL if all indices should be used.

minvls

minimum number of samples for each marginal likelihood term. Negative values provides a default which depends on the dimension of the integration.

do_reorder

TRUE if a heuristic variable reordering should be used. TRUE is likely the best value.

use_aprx

TRUE if a less precise approximation of pnorm and qnorm should be used. This may reduce the computation time while not affecting the result much.

n_threads

number of threads to use.

cluster_weights

numeric vector with weights for each cluster. Use NULL if all clusters have weight one.

fix

integer vector with indices of par to fix. This is useful for computing profile likelihoods. NULL yields all parameters.

standardized

logical for whether to use the standardized or direct parameterization. See standardized_to_direct and the vignette at vignette("pedmod", package = "pedmod").

method

integer with the method to use. Zero yields randomized Korobov lattice rules while one yields scrambled Sobol sequences.

use_tilting

TRUE if the minimax tilting method suggested by Botev (2017) should be used. See doi:10.1111/rssb.12162.

vls_scales

can be a numeric vector with a positive scalar for each cluster. Then vls_scales[i] * minvls and vls_scales[i] * maxvls is used for cluster i rather than minvls and maxvls. Set vls_scales = NULL if the latter should be used.

...

Arguments passed to opt_func.

data

the list that was passed to pedigree_ll_terms or pedigree_ll_terms_loadings.

scale_max

the maximum value for the scale parameters. Sometimes, the optimization method tends to find large scale parameters and get stuck. Setting a maximum solves this.

sc_start

starting value for the scale parameters. Use NULL if you have no value to start with.

sc_start_invariant

scale parameter(s) like sc_start. It is the value that all individuals should have (i.e. not one that varies by individual).

Details

pedmod_start and pedmod_start_loadings yield starting values which can be used for pedmod_opt. The methods are based on a heuristics.

Value

pedmod_opt: The output from the opt_func argument. Thus, if fix is supplied then this is optimal values of only par[-fix] with par[fix] being fixed to the inputs. Thus, the length is only the number of non-fixed parameters.

pedmod_start: A list with:

pedmod_start_loadings: A list with:

See Also

pedmod_sqn.

Examples


# we simulate outcomes with an additive genetic effect. The kinship matrix is
# the same for all families and given by
K <- matrix(c(
  0.5  , 0    , 0.25 , 0   , 0.25 , 0   , 0.125 , 0.125 , 0.125 , 0.125 ,
  0    , 0.5  , 0.25 , 0   , 0.25 , 0   , 0.125 , 0.125 , 0.125 , 0.125 ,
  0.25 , 0.25 , 0.5  , 0   , 0.25 , 0   , 0.25  , 0.25  , 0.125 , 0.125 ,
  0    , 0    , 0    , 0.5 , 0    , 0   , 0.25  , 0.25  , 0     , 0     ,
  0.25 , 0.25 , 0.25 , 0   , 0.5  , 0   , 0.125 , 0.125 , 0.25  , 0.25  ,
  0    , 0    , 0    , 0   , 0    , 0.5 , 0     , 0     , 0.25  , 0.25  ,
  0.125, 0.125, 0.25 , 0.25, 0.125, 0   , 0.5   , 0.25  , 0.0625, 0.0625,
  0.125, 0.125, 0.25 , 0.25, 0.125, 0   , 0.25  , 0.5   , 0.0625, 0.0625,
  0.125, 0.125, 0.125, 0   , 0.25 , 0.25, 0.0625, 0.0625, 0.5   , 0.25  ,
  0.125, 0.125, 0.125, 0   , 0.25 , 0.25, 0.0625, 0.0625, 0.25  , 0.5
), 10)

# simulates a data set.
#
# Args:
#   n_fams: number of families.
#   beta: the fixed effect coefficients.
#   sig_sq: the scale parameter.
sim_dat <- function(n_fams, beta = c(-1, 1, 2), sig_sq = 3){
  # setup before the simulations
  Cmat <- 2 * K
  n_obs <- NROW(K)
  Sig <- diag(n_obs) + sig_sq * Cmat
  Sig_chol <- chol(Sig)

  # simulate the data
  out <- replicate(
    n_fams, {
      # simulate covariates
      X <- cbind(`(Intercept)` = 1, Continuous = rnorm(n_obs),
                 Binary = runif(n_obs) > .5)

      # assign the linear predictor + noise
      eta <- drop(X %*% beta) + drop(rnorm(n_obs) %*% Sig_chol)

      # return the list in the format needed for the package
      list(y = as.numeric(eta > 0), X = X, scale_mats = list(Cmat))
    }, simplify = FALSE)

  # add attributes with the true values and return
  attributes(out) <- list(beta = beta, sig_sq = sig_sq)
  out
}

# simulate the data
set.seed(1)
dat <- sim_dat(100L)

# fit the model
ptr <- pedigree_ll_terms(dat, max_threads = 1L)
start <- pedmod_start(ptr = ptr, data = dat, n_threads = 1L)
fit <- pedmod_opt(ptr = ptr, par = start$par, n_threads = 1L, use_aprx = TRUE,
                  maxvls = 5000L, minvls = 1000L, abs_eps = 0, rel_eps = 1e-3)
fit$par # the estimate
-fit$value # the log maximum likelihood
start$logLik_no_rng # the log maximum likelihood without the random effects


[Package pedmod version 0.2.4 Index]