R: Confidence set for functions of model parameters

fct_confint {EpiForsk}

R Documentation

Confidence set for functions of model parameters

Description

Computes confidence sets for functions of model parameters by first computing a confidence set for the model parameters and then returning the image of that set under the supplied function.

Usage

fct_confint(
  object,
  f,
  which_parm = rep(TRUE, length(coef(object))),
  level = 0.95,
  ...
)

## S3 method for class 'lm'
fct_confint(
  object,
  f,
  which_parm = rep(TRUE, length(coef(object))),
  level = 0.95,
  return_beta = FALSE,
  n_grid = NULL,
  k = NULL,
  len = 0.1,
  parallel = c("sequential", "multisession", "multicore", "cluster"),
  n_cores = 10L,
  ...
)

## S3 method for class 'glm'
fct_confint(
  object,
  f,
  which_parm = rep(TRUE, length(coef(object))),
  level = 0.95,
  return_beta = FALSE,
  n_grid = NULL,
  k = NULL,
  len = 0.1,
  parallel = c("sequential", "multisession", "multicore", "cluster"),
  n_cores = 10L,
  ...
)

## S3 method for class 'lms'
fct_confint(
  object,
  f,
  which_parm = rep(TRUE, length(coef(object))),
  level = 0.95,
  return_beta = FALSE,
  len = 0.1,
  n_grid = 0L,
  k = 1000L,
  parallel = c("sequential", "multisession", "multicore", "cluster"),
  n_cores = 10,
  ...
)

## Default S3 method:
fct_confint(
  object,
  f,
  which_parm = rep(TRUE, length(coef(object))),
  level = 0.95,
  ...
)

Arguments

`object`	A fitted model object.
`f`	A function taking the parameter vector as its single argument, and returning a numeric vector.
`which_parm`	Either a logical vector the same length as the coefficient vector, with `TRUE` indicating a coefficient is used by `f`, or an integer vector with the indices of the coefficients used by `f`.
`level`	The confidence level required.
`...`	Additional argument(s) passed to methods.
`return_beta`	Logical, if `TRUE` returns both the confidence limits and the parameter values used from the boundary of the parameter confidence set.
`n_grid`	Either `NULL` or an integer vector of length 1 or the number of `TRUE`/indices in which_parm. Specifies the number of grid points in each dimension of a grid with endpoints defined by len. If `NULL` or `0L`, will instead sample k points uniformly on a sphere.
`k`	If n_grid is `NULL` or `0L`, the number of points to sample uniformly from a sphere.
`len`	Numeric, the radius of the sphere or box used to define the directions in which to look for boundary points of the parameter confidence set.
`parallel`	Character, specify how futures are resolved. Default is "sequential". Can be "multisession" to resolve in parallel in separate R sessions, "multicore" (not supported on Windows) to resolve in parallel in forked R processes, or "cluster" to resolve in parallel in separate R sessions running on one or more machines.
`n_cores`	An integer specifying the number of threads to use for parallel computing.

Details

Assume the response Y and predictors X are given by a generalized linear model, that is, they fulfill the assumptions

E(Y|X)=\mu(X^T\beta)

V(Y|X)=\psi \nu(\mu(X^T\beta))

Y|X\sim\varepsilon(\theta,\nu_{\psi}).

Here \mu is the mean value function, \nu is the variance function, and \psi is the dispersion parameter in the exponential dispersion model \varepsilon(\theta,\nu_{\psi}), where \theta is the canonical parameter and \nu_{\psi} is the structure measure. Then it follows from the central limit theorem that

\hat\beta\sim N(\beta, (X^TWX)^{-1})

will be a good approximation in large samples, where X^TWX is the Fisher information of the exponential dispersion model.
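As a minimal sketch of this approximation (not part of fct_confint() itself), the matrix (X^TWX)^{-1} can be computed by hand for a Poisson regression and compared with the covariance matrix reported by stats::vcov(). The simulated data and the names dat, fit, and V_manual are illustrative assumptions. The Poisson family is used because its dispersion parameter \psi is fixed at 1, so no rescaling is needed.

```r
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rpois(n, exp(0.3 + 0.5 * dat$x1 - 0.2 * dat$x2))

fit <- glm(y ~ x1 + x2, family = poisson(), data = dat)

X <- model.matrix(fit)
w <- fit$weights  # working weights from the final IWLS iteration (diagonal of W)

# (X^T W X)^{-1}, written as the inverse of crossprod(sqrt(W) X)
V_manual <- solve(crossprod(X * sqrt(w)))

# Should agree with vcov(fit) up to numerical error
all.equal(V_manual, vcov(fit), check.attributes = FALSE)
```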

From this, the combinant

(\hat\beta-\beta)^TX^TWX(\hat\beta-\beta)

is an approximate pivot, with a \chi_p^2 distribution. Then

C_{\beta}=\{\beta|(\hat\beta-\beta)^TX^TWX(\hat\beta-\beta)<\chi_p^2(1-\alpha)\}

is an approximate (1-\alpha)-confidence set for the parameter vector \beta. Similarly, confidence sets for sub-vectors of \beta can be obtained from the fact that the marginal distributions of a multivariate normal distribution are again normal, with mean vector and covariance matrix given by the corresponding sub-vector and sub-matrix.
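The membership condition defining C_{\beta} can be sketched in a few lines of base R. The helper in_conf_set() below is hypothetical and not part of EpiForsk. It assumes that vcov(fit) plays the role of (X^TWX)^{-1} and that the \chi^2 quantile is used as in the formula above. Passing which_parm selects the matching sub-vector and sub-matrix.

```r
set.seed(2)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + dat$x1 - dat$x2 + 0.5 * dat$x3 + rnorm(n)
fit <- lm(y ~ x1 + x2 + x3, data = dat)

# Hypothetical helper: is `beta` inside the approximate confidence set for the
# coefficients selected by `which_parm`?
in_conf_set <- function(fit, beta, which_parm = seq_along(coef(fit)),
                        level = 0.95) {
  beta_hat <- coef(fit)[which_parm]
  # Sub-matrix of the estimated covariance matrix
  V <- vcov(fit)[which_parm, which_parm, drop = FALSE]
  d <- beta - beta_hat
  # Quadratic form (beta_hat - beta)^T V^{-1} (beta_hat - beta)
  q <- drop(t(d) %*% solve(V, d))
  q < qchisq(level, df = length(beta_hat))
}

# The estimate itself gives a quadratic form of zero, so it lies inside
in_conf_set(fit, coef(fit))
# A point far from the estimate, checked for the sub-vector (x1, x2)
in_conf_set(fit, coef(fit)[2:3] + 10, which_parm = 2:3)
```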

Finally, a confidence set for the transformed parameters f(\beta) is obtained as

\{f(\beta)|\beta\in C_{\beta}\}

Note that this is a conservative confidence set, since parameter values outside C_{\beta} can also be mapped into the confidence set of the transformed parameter.
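For a linear function f(\beta)=a^T\beta the degree of conservatism can be worked out directly: the image of the ellipsoid C_{\beta} is an interval centred at a^T\hat\beta with half-width \sqrt{\chi_p^2(1-\alpha) a^TVa}, where V is the estimated covariance matrix, while the usual one-dimensional Wald interval uses the \chi_1^2 quantile instead. The sketch below compares the two for f = sum on simulated data. All object names are illustrative, and it does not call fct_confint().

```r
set.seed(3)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- dat$x1 + dat$x2 + dat$x3 + rnorm(n)
fit <- lm(y ~ 0 + x1 + x2 + x3, data = dat)

p <- length(coef(fit))
a <- rep(1, p)  # f(beta) = sum(beta) = a^T beta
level <- 0.95

# Standard error of a^T beta_hat
se_f <- sqrt(drop(t(a) %*% vcov(fit) %*% a))

# Half-width of the image of C_beta under f (chi-square quantile with p df)
half_image <- sqrt(qchisq(level, df = p)) * se_f
# Half-width of the one-dimensional Wald interval (chi-square quantile with 1 df)
half_wald <- sqrt(qchisq(level, df = 1)) * se_f

c(image = half_image, wald = half_wald, ratio = half_image / half_wald)
```

The ratio depends only on p and the level, not on the data, and grows with the number of parameters used by f.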

To determine C_{\beta}, fct_confint() uses a convex optimization program when f follows DCP (disciplined convex programming) rules. Otherwise, it finds the boundary by taking a number of points around \hat\beta and projecting them onto the boundary. In this case, the confidence set of the transformed parameter will only be valid if the boundary of C_{\beta} is mapped to the boundary of the confidence set for the transformed parameter.

The points projected to the boundary are either laid out in a grid around \hat\beta, with the number of points in each direction determined by n_grid, or uniformly at random on a hypersphere, with the number of points determined by k. The radius of the grid/sphere is determined by len.
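The sphere-based variant can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: points are drawn uniformly on a sphere of radius len by normalising Gaussian draws, and each point is then moved along its ray from \hat\beta until the quadratic form equals the \chi_p^2 quantile. The projection actually used by fct_confint() may differ. With this ray-based projection, len only sets the starting radius and does not affect the projected points.

```r
set.seed(4)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- dat$x1 + dat$x2 + dat$x3 + rnorm(n)
fit <- lm(y ~ 0 + x1 + x2 + x3, data = dat)

beta_hat <- coef(fit)
V_inv <- solve(vcov(fit))
p <- length(beta_hat)
crit <- qchisq(0.95, df = p)
k <- 1000L
len <- 0.1

# k points uniformly on a sphere of radius len: normalise Gaussian draws
z <- matrix(rnorm(k * p), nrow = k)
offsets <- len * z / sqrt(rowSums(z^2))

# Rescale each offset so that its quadratic form equals crit,
# i.e. move the point along its ray from beta_hat onto the boundary
q <- rowSums((offsets %*% V_inv) * offsets)
boundary <- sweep(offsets * sqrt(crit / q), 2, beta_hat, "+")

# Approximate limits of f(beta) = sum(beta) over the sampled boundary points
f_vals <- apply(boundary, 1, sum)
c(estimate = sum(beta_hat), conf.low = min(f_vals), conf.high = max(f_vals))
```

Because only finitely many boundary points are sampled, the resulting limits can never be wider than the exact image of C_{\beta}; increasing k brings them closer to it.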

To print a progress bar with information about the fitting process, wrap the call to fct_confint() in progressr::with_progress(), for example progressr::with_progress({result <- fct_confint(object, f)}).

Value

A tibble with columns estimate, conf.low, and conf.high or, if return_beta is TRUE, a list containing the tibble and the beta values on the boundary used to calculate the confidence limits.

Author(s)

KIJA

Examples

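# Simulate five covariates cov1, ..., cov5 and a response equal to their sum
# plus standard normal noise
data <- 1:5 |>
  purrr::map(
    \(x) {
      name <- paste0("cov", x)
      dplyr::tibble("{name}" := rnorm(100, 1))
    }
  ) |>
  purrr::list_cbind() |>
  dplyr::mutate(
    y = rowSums(dplyr::across(dplyr::everything())) + rnorm(100)
  )
# Fit a linear model without intercept on all covariates
# (named fit to avoid masking stats::lm)
fit <- lm(
  as.formula(
    paste0("y ~ 0 + ", paste0(names(data)[names(data) != "y"], collapse = " + "))
  ),
  data
)
# 95% confidence set for the sum of all coefficients
fct_confint(fit, sum)
# 50% confidence set for the sum of the coefficients selected by which_parm
fct_confint(fit, sum, which_parm = 1:3, level = 0.5)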

[Package EpiForsk version 0.1.1 Index]