R: Adaptive P-value Thresholding with Generalized Linear Models

adapt_glm {adaptMT}

R Documentation

Adaptive P-value Thresholding with Generalized Linear Models

Description

adapt_glm is a wrapper of adapt that fits pi(x) and mu(x) by glm.

Usage

adapt_glm(x, pvals, pi_formulas, mu_formulas, dist = beta_family(),
  s0 = rep(0.45, length(pvals)), alphas = seq(0.01, 1, 0.01),
  piargs = list(), muargs = list(), ...)

Arguments

`x`	covariates (i.e. side-information). Should be compatible with `models`. See Details
`pvals`	a vector of values in [0, 1]. P-values
`pi_formulas`	a vector/list of strings/formulas. Formulas for fitting pi(x) by glm. See Details
`mu_formulas`	a vector/list of strings/formulas. Formulas for fitting mu(x) by glm. See Details
`dist`	an object of class "`gen_exp_family`". `beta_family()` as default
`s0`	a vector of values in [0, 0.5). Initial threshold.
`alphas`	a vector of values in (0, 1). Target FDR levels.
`piargs`	a list. Other arguments passed to glm for fitting pi(x)
`muargs`	a list. Other arguments passed to glm for fitting mu(x)
`...`	other arguments passed to `adapt` (except `models`)

Details

pi_formulas and mu_formulas can each be a list or a vector whose elements are strings or formulas. For instance, if x has a single column named x1, the following five specifications are equivalent (ns builds a natural cubic spline basis with df degrees of freedom):

c("x1", "ns(x1, df = 8)");
c("~ x1", "~ ns(x1, df = 8)");
list("x1", "ns(x1, df = 8)");
list("~ x1", "~ ns(x1, df = 8)");
list(~ x1, ~ ns(x1, df = 8))

There is no need to specify the name of the response variable, as this is handled in the function.
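
As a rough illustration of why the five specifications listed above describe the same pair of models, the sketch below coerces each element to a one-sided formula with base R and compares the results. The helper to_rhs_formula is hypothetical and written only for this example; it is not part of adaptMT and is not claimed to match how adapt_glm parses its inputs internally.

```r
# Hypothetical helper (not part of adaptMT): coerce a string or a formula
# to a one-sided formula, adding the leading "~" when it is missing.
to_rhs_formula <- function(f) {
  if (inherits(f, "formula")) return(f)
  f <- trimws(f)
  if (!startsWith(f, "~")) f <- paste("~", f)
  as.formula(f)
}

# The five spellings from the Details section.
specs <- list(
  c("x1", "ns(x1, df = 8)"),
  c("~ x1", "~ ns(x1, df = 8)"),
  list("x1", "ns(x1, df = 8)"),
  list("~ x1", "~ ns(x1, df = 8)"),
  list(~ x1, ~ ns(x1, df = 8))
)

# Normalize every element and deparse it back to text for comparison.
normalized <- lapply(specs, function(s) {
  vapply(s, function(f) deparse(to_rhs_formula(f)), character(1),
         USE.NAMES = FALSE)
})

# TRUE when all five spellings normalize to the same two formulas.
all_same <- all(vapply(normalized, identical, logical(1), normalized[[1]]))
print(all_same)
```

Note that no response variable appears on the left-hand side of any formula, in line with the remark above.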

When x has only a few variables, it is common to fit a non-parametric GLM by replacing x with a spline basis of x. In this case, ns from the splines package is suggested.
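
To make the spline-basis idea concrete, here is a minimal sketch on simulated data. It assumes a single covariate named x1 and a made-up binary response y; both are illustrative only, and the plain glm call stands in for, but is not, the fitting that adapt_glm performs internally.

```r
library(splines)

set.seed(1)
n  <- 200
x1 <- runif(n)

# Natural cubic spline basis: one column per degree of freedom.
basis <- ns(x1, df = 6)
basis_dim <- dim(basis)                      # n rows, df columns
n_knots   <- length(attr(basis, "knots"))    # interior knots chosen by ns

# Simulated binary response with a smooth, non-linear dependence on x1.
y <- rbinom(n, size = 1, prob = plogis(sin(2 * pi * x1)))

# A GLM that is linear in the spline basis, hence flexible in x1 itself.
fit <- glm(y ~ ns(x1, df = 6), family = binomial())
n_coef <- length(coef(fit))                  # intercept + df basis terms

print(basis_dim)
print(n_knots)
print(n_coef)
```

With the default intercept = FALSE, ns places df - 1 interior knots at quantiles of x1, so a larger df gives a more flexible fit at the cost of more parameters.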

Examples


# Load estrogen data
data(estrogen)
pvals <- as.numeric(estrogen$pvals)
x <- data.frame(x = as.numeric(estrogen$ord_high))
dist <- beta_family()

# Subsample the data for convenience
inds <- (x$x <= 5000)
pvals <- pvals[inds]
x <- x[inds,,drop = FALSE]

# Run adapt_glm
library("splines")
formulas <- paste0("ns(x, df = ", 6:10, ")")
res <- adapt_glm(x = x, pvals = pvals, pi_formulas = formulas,
                 mu_formulas = formulas, dist = dist, nfits = 10)

# Run adapt by manually setting models for glm
models <- lapply(formulas, function(formula){
    piargs <- muargs <- list(formula = formula)
    gen_adapt_model(name = "glm", piargs = piargs, muargs = muargs)
})
res2 <- adapt(x = x, pvals = pvals, models = models,
              dist = dist, nfits = 10)

# Check equivalence
identical(res, res2)