adapt_gam {adaptMT}

R Documentation

Adaptive P-value Thresholding with Generalized Additive Models

Description

adapt_gam is a wrapper around adapt that fits pi(x) and mu(x) with gam from the mgcv package.

Usage

adapt_gam(x, pvals, pi_formulas, mu_formulas, piargs = list(),
  muargs = list(), dist = beta_family(), s0 = rep(0.45, length(pvals)),
  alphas = seq(0.01, 1, 0.01), ...)

Arguments

`x`	covariates (i.e. side-information). Should be compatible with the models built from `pi_formulas` and `mu_formulas`. See Details
`pvals`	a vector of values in [0, 1]. P-values
`pi_formulas`	a vector/list of strings/formulas. Formulas for fitting pi(x) by gam. See Details
`mu_formulas`	a vector/list of strings/formulas. Formulas for fitting mu(x) by gam. See Details
`piargs`	a list. Other arguments passed to gam for fitting pi(x)
`muargs`	a list. Other arguments passed to gam for fitting mu(x)
`dist`	an object of class "`gen_exp_family`". `beta_family()` as default
`s0`	a vector of values in [0, 0.5). Initial threshold.
`alphas`	a vector of values in (0, 1]. Target FDR levels.
`...`	other arguments passed to `adapt` (except `models`)
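
The ranges listed above can be checked before calling adapt_gam. The sketch below is a hypothetical helper, check_adapt_inputs, which is not part of adaptMT. It assumes that x has one row per p-value, and it lets alphas reach 1 because the default grid in Usage ends at 1.

```r
# Hypothetical pre-flight check (not part of adaptMT).
# Assumption: x has one row per p-value.
check_adapt_inputs <- function(x, pvals,
                               s0 = rep(0.45, length(pvals)),
                               alphas = seq(0.01, 1, 0.01)) {
  stopifnot(
    is.numeric(pvals), all(pvals >= 0 & pvals <= 1),      # pvals in [0, 1]
    NROW(x) == length(pvals),                             # one covariate row per p-value
    length(s0) == length(pvals), all(s0 >= 0 & s0 < 0.5), # s0 in [0, 0.5)
    all(alphas > 0 & alphas <= 1)                         # target FDR levels
  )
  invisible(TRUE)
}

# Passes silently for well-formed inputs
check_adapt_inputs(data.frame(x1 = 1:3), c(0.1, 0.5, 0.9))
```

An out-of-range p-value or threshold makes stopifnot raise an error that names the failing condition.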

Details

pi_formulas and mu_formulas can each be a list or a vector whose elements are strings or formulas. For instance, if x has a single column named x1, the following five specifications are equivalent (ns forms a natural spline basis with df degrees of freedom, and s specifies a smooth term whose smoothness is selected automatically by gam, by generalized cross-validation by default):

c("x1", "ns(x1, df = 8)", "s(x1)");
c("~ x1", "~ ns(x1, df = 8)", "~ s(x1)");
list("x1", "ns(x1, df = 8)", "s(x1)");
list("~ x1", "~ ns(x1, df = 8)", "~ s(x1)");
list(~ x1, ~ ns(x1, df = 8), ~ s(x1))

There is no need to specify the name of the response variable, as this is handled in the function.
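
To see why strings and one-sided formulas can be used interchangeably, the sketch below reduces each specification to its right-hand side and attaches a response. The helper to_full_formula and the placeholder response name y are hypothetical and only illustrate the idea; adapt_gam does this internally in its own way.

```r
# Hypothetical helper (not part of adaptMT): turn one specification,
# given as a string or a formula, into a two-sided formula.
to_full_formula <- function(spec, response = "y") {
  rhs <- if (inherits(spec, "formula")) {
    # The last element of a formula object is its right-hand side
    paste(deparse(spec[[length(spec)]]), collapse = " ")
  } else {
    # For strings, drop everything up to and including "~", if present
    sub("^.*~", "", spec)
  }
  as.formula(paste(response, "~", trimws(rhs)))
}

# A bare string, a string with "~", and a one-sided formula
specs <- list("x1", "~ ns(x1, df = 8)", ~ s(x1))
full <- lapply(specs, to_full_formula)
print(full)
```

Each element of full is an ordinary two-sided formula with y on the left-hand side.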

When x has only a few variables, it is common to fit a non-parametric GLM by replacing x with a spline basis of x. In this case, ns from the splines package or s from the mgcv package is suggested. When s is used, it is treated as a single model, because gam selects the amount of smoothing automatically.
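
The difference between the two choices can be seen on toy data. The sketch below is independent of adaptMT; the simulated x1 and y are made up for illustration. ns fixes the basis size through df, so each df value is a separate candidate model, whereas a single s term lets gam choose the effective degrees of freedom.

```r
library(splines)
library(mgcv)

# Toy data, for illustration only
set.seed(1)
x1 <- seq(-1, 1, length.out = 200)
y <- sin(3 * x1) + rnorm(200, sd = 0.3)

# ns(): the number of basis columns is fixed by df
B <- ns(x1, df = 8)
dim(B)

# s(): gam chooses the amount of smoothing itself
fit <- gam(y ~ s(x1))
sum(fit$edf)  # effective degrees of freedom of the fitted model
```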

Examples


# Generate a 2-dim x
n <- 400
x1 <- x2 <- seq(-100, 100, length.out = 20)
x <- expand.grid(x1, x2)
colnames(x) <- c("x1", "x2")

# Generate p-values (one-sided z test)
# Set all hypotheses in the central circle with radius 30 to be
# non-nulls. For non-nulls, z~N(2,1) and for nulls, z~N(0,1).
# nonnull is TRUE inside the circle, i.e. where the null is false
nonnull <- apply(x, 1, function(coord){sum(coord^2) < 900})
mu <- ifelse(nonnull, 2, 0)
set.seed(0)
zvals <- rnorm(n) + mu
pvals <- 1 - pnorm(zvals)

# Run adapt_gam with a 2d spline basis
library("adaptMT")
library("mgcv")
formula <- "s(x1, x2)"
dist <- beta_family()
# nfits is not an argument of adapt_gam; it is passed on to adapt via ...
res <- adapt_gam(x = x, pvals = pvals, pi_formulas = formula,
                 mu_formulas = formula, dist = dist, nfits = 5)