R: Solve the EBNM problem for grouped data

ebnm_group {ebnm}

R Documentation

Solve the EBNM problem for grouped data

Description

Solves the empirical Bayes normal means (EBNM) problem for observations belonging to distinct groups.

Usage

ebnm_group(
  x,
  s = 1,
  group,
  prior_family = "point_normal",
  mode = 0,
  scale = "estimate",
  g_init = NULL,
  fix_g = FALSE,
  output = ebnm_output_default(),
  ...
)

Arguments

`x`	A vector of observations. Missing observations (`NA`s) are not allowed.
`s`	A vector of standard errors (or a scalar if all are equal). Standard errors may not be exactly zero, and missing standard errors are not allowed. Two prior families have additional restrictions: when horseshoe priors are used, errors must be homoskedastic; and since function `deconv` in package `deconvolveR` takes `z`-scores, the "deconvolver" family requires that all standard errors be equal to 1.
`group`	A vector of character strings that gives the group to which each observation belongs. It must have the same length as argument `x`. For an example of usage, see Examples below.
`prior_family`	A named vector that specifies the prior family `G` for each group. If the same prior family is to be used for all groups, then a character string may be used instead.
`mode`	A named list that specifies, for each group, the mode of the respective prior `g`, or `"estimate"` if the mode is to be estimated from the data. If the mode is the same across groups, then a scalar may be used instead. If all modes are to be estimated, then `mode = "estimate"` may be used.
`scale`	A named list that specifies, for each group, the scale parameter(s) of the respective prior, or `"estimate"` if the scale parameters are to be estimated from the data. If the scale parameter is the same across groups, then a scalar may be used instead. If all scales are to be estimated, then `scale = "estimate"` may be used.
`g_init`	The prior distributions `g`. Usually this is left unspecified (`NULL`) and estimated from the data. However, it can be used in conjuction with `fix_g = TRUE` to fix the prior (useful, for example, to do computations with the "true" `g` in simulations). If `g_init` is specified but `fix_g = FALSE`, `g_init` specifies the initial value of `g` used during optimization. If `g_init` is supplied, it should be a named list that specifies, for each group, a prior of the appropriate class (`normalmix` for normal, point-normal, scale mixture of normals, and `deconvolveR` prior families, as well as for the NPMLE; class `laplacemix` for point-Laplace families; class `gammamix` for point-exponential families; class `horseshoe` for horseshoe families; and class `unimix` for `unimodal_` families).
`fix_g`	If `TRUE`, fix the prior `g` at `g_init` instead of estimating it.
`output`	A character vector indicating which values are to be returned. Function `ebnm_output_default()` provides the default return values, while `ebnm_output_all()` lists all possible return values. See Value below.
`...`	Additional parameters. When a `unimodal_` prior family is used, these parameters are passed to function `ash` in package `ashr`. Although it does not call into `ashr`, the scale mixture of normals family accepts parameter `gridmult` for purposes of comparison. When `gridmult` is set, an `ashr`-style grid will be used instead of the default `ebnm` grid. When the "deconvolver" family is used, additional parameters are passed to function `deconv` in package `deconvolveR`. Families of generalized binary priors take several additional parameters; see `ebnm_generalized_binary`. In all other cases, additional parameters are ignored.

Details

The EBNM model for grouped data, with observations x_j belonging to groups k = 1, ..., K, is

x_j | \theta_j, s_j \sim N(\theta_j, s_j^2)

\theta_j \sim g_{k(j)} \in G_{k(j)}.

Solving the EBNM problem for grouped data is equivalent to solving a separate EBNM problem for each group k = 1, ..., K, with the optimal log likelihood equal to the sum of the optimal log likelihoods for each separate problem.

Value

An ebnm object. Depending on the argument to output, the object is a list containing elements:

data: A data frame containing the observations x and standard errors s.
posterior: A data frame of summary results (posterior means, standard deviations, second moments, and local false sign rates).
fitted_g: The fitted prior \hat{g} (an object of class normalmix, laplacemix, gammamix, unimix, tnormalmix, or horseshoe).
log_likelihood: The optimal log likelihood attained, L(\hat{g}).
posterior_sampler: A function that can be used to produce samples from the posterior. For all prior families other than the horseshoe, the sampler takes a single parameter nsamp, the number of posterior samples to return per observation. Since ebnm_horseshoe returns an MCMC sampler, it additionally takes parameter burn, the number of burn-in samples to discard.

S3 methods coef, confint, fitted, logLik, nobs, plot, predict, print, quantile, residuals, simulate, summary, and vcov have been implemented for ebnm objects. For details, see the respective help pages, linked below under See Also.

Examples

group <- c(rep("small_sd", 100), rep("large_sd", 100))
theta <- c(rnorm(100, sd = 1), rnorm(100, sd = 10))
s <- 1
x <- theta + rnorm(200, 0, s)

ebnm.group.res <- ebnm_group(x, s, group)

# Use different prior families for each group:
ebnm.group.res <- ebnm_group(
  x, s, group,
  prior_family = list(small_sd = "normal", large_sd = "normal_scale_mixture")
)

# Different modes and scales can be set similarly:
ebnm.group.res <- ebnm_group(
  x, s, group,
  mode = list(small_sd = 0, large_sd = "estimate"),
  scale = list(small_sd = 1, large_sd = "estimate")
)

[Package ebnm version 1.1-2 Index]