ash {ashr}  R Documentation 
Adaptive Shrinkage
Description
Implements Empirical Bayes shrinkage and false discovery rate methods based on unimodal prior distributions.
Usage
ash(
betahat,
sebetahat,
mixcompdist = c("uniform", "halfuniform", "normal", "+uniform", "-uniform",
"halfnormal"),
df = NULL,
...
)
ash.workhorse(
betahat,
sebetahat,
method = c("fdr", "shrink"),
mixcompdist = c("uniform", "halfuniform", "normal", "+uniform", "-uniform",
"halfnormal"),
optmethod = c("mixSQP", "mixIP", "cxxMixSquarem", "mixEM", "mixVBEM", "w_mixEM"),
df = NULL,
nullweight = 10,
pointmass = TRUE,
prior = c("nullbiased", "uniform", "unit"),
mixsd = NULL,
gridmult = sqrt(2),
outputlevel = 2,
g = NULL,
fixg = FALSE,
mode = 0,
alpha = 0,
grange = c(-Inf, Inf),
control = list(),
lik = NULL,
weights = NULL,
pi_thresh = 1e-10
)
Arguments
betahat 
a p vector of estimates 
sebetahat 
a p vector of corresponding standard errors 
mixcompdist 
distribution of components in mixture used to represent the family G.
Depending on the choice of mixture component, the family G becomes more or less flexible.
Options are "uniform", "halfuniform", "normal", "+uniform", "-uniform" and "halfnormal".
If you are happy to assume a symmetric distribution for effects, you can use "uniform" or "normal". If you believe your effects may be asymmetric, use "halfuniform" or "halfnormal". If you want to allow only positive/negative effects, use "+uniform"/"-uniform". The use of "normal" and "halfnormal" is permitted only if df=NULL. 
df 
appropriate degrees of freedom for (t) distribution of (betahat-beta)/sebetahat; default is NULL, which is treated as infinity (Gaussian) 
... 
Further arguments of function ash, to be passed to ash.workhorse 
method 
specifies how ash is to be run. Can be "shrink" (if main aim is shrinkage) or "fdr" (if main aim is to assess false discovery rate or false sign rate (fsr)). This is simply a convenient way to specify certain combinations of parameters: "shrink" sets pointmass=FALSE and prior="uniform"; "fdr" sets pointmass=TRUE and prior="nullbiased". 
optmethod 
specifies the function implementing an optimization method. 
nullweight 
scalar, the weight put on the prior under the "nullbiased" specification; see prior 
pointmass 
Logical, indicating whether to use a point mass at zero as one of components for a mixture distribution. 
prior 
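string, or numeric vector indicating Dirichlet prior on mixture proportions: "nullbiased", c(nullweight,1,...,1), puts more weight on the first component; "uniform" is c(1,1,...,1); "unit" is (1/K,...,1/K), for the optmethod = "mixVBEM" version only. 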
mixsd 
Vector of standard deviations for underlying mixture components. 
gridmult 
the multiplier by which successive default grid values for mixsd differ from one another. (Smaller values produce finer grids.) 
outputlevel 
Determines amount of output. There are several numeric options: 0 = just fitted g; 1 = also PosteriorMean and PosteriorSD; 2 = everything usually needed; 3 = also include results of mixture fitting procedure (including matrix of log-likelihoods used to fit mixture). 4 and 5 are reserved for outputting additional data required by the (in-development) flashr package. The user can also specify the output they require in detail (see Examples). 
g 
The prior distribution for beta. Usually this is unspecified (NULL) and estimated from the data. However, it can be used in conjunction with fixg=TRUE to specify the g to use (e.g. useful in simulations to do computations with the "true" g). Or, if g is specified but fixg=FALSE, the g specifies the initial value of g used before optimization (which also implicitly specifies mixcompdist). 
fixg 
If TRUE, don't estimate g but use the specified g; useful for computations under the "true" g in simulations. 
mode 
either a numeric value (indicating the mode of g), the string "estimate" (to indicate that the mode should be estimated), or a numeric vector of length two giving the interval to be searched for the mode. 
alpha 
Numeric value of alpha parameter in the model. 
grange 
Numeric vector of length two indicating the left and right limits of g. Default is c(-Inf, Inf). 
control 
A list of control parameters passed to optmethod. 
lik 
Contains details of the likelihood used; for general ash. Currently, the following choices are allowed: normal (see function lik_normal); binomial likelihood (see function lik_binom); likelihood based on logF error distribution (see function lik_logF); mixture of normals likelihood (see function lik_normalmix); and Poisson likelihood (see function lik_pois). 
weights 
a vector of weights for observations; use with optmethod = "w_mixEM"; this is currently beta functionality. 
pi_thresh 
a threshold below which to prune out mixture components before computing summaries (speeds up computation since empirically many components are usually assigned negligible weight). The current implementation still returns the full fitted distribution; this only affects the posterior summaries. 
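To make the role of gridmult concrete, the following base-R sketch builds a geometric grid of standard deviations in which consecutive values differ by the factor gridmult. The endpoints sd_min and sd_max are hypothetical values chosen for illustration; this is not the package's own grid-selection code, which picks the range from the data.

```r
# Hypothetical grid endpoints (illustration only; not taken from ashr).
sd_min   <- 0.1
sd_max   <- 3.2
gridmult <- sqrt(2)

# Number of multiplicative steps needed to get from sd_max down to sd_min.
n_steps <- ceiling(log(sd_max / sd_min) / log(gridmult))

# Geometric grid: each value is gridmult times the previous one.
mixsd <- sd_max * gridmult^((-n_steps):0)

# Ratios between neighbouring grid values; all equal gridmult.
ratios <- mixsd[-1] / mixsd[-length(mixsd)]
print(round(mixsd, 3))
```

A smaller gridmult (closer to 1) gives more grid points over the same range, hence a finer grid and more mixture components to fit.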
Details
The ash function provides a number of ways to perform Empirical Bayes shrinkage estimation and false discovery rate estimation. The main assumption is that the underlying distribution of effects is unimodal. Novice users are recommended to start with the examples provided below.
In the simplest case the inputs to ash are a vector of estimates (betahat), their corresponding standard errors (sebetahat), and degrees of freedom (df). The method assumes that for some (unknown) "true" vector of effects beta, the statistic (betahat[j]-beta[j])/sebetahat[j] has a $t$ distribution on $df$ degrees of freedom. (The default of df=NULL assumes a normal distribution instead of a t.)
By default the method estimates the vector beta under the assumption that beta ~ g for a distribution g in G, where G is some unimodal family of distributions to be specified (see parameter mixcompdist).
The default is to assume the mode is 0, which is suitable for settings where you are interested in testing which beta[j] are nonzero. To estimate the mode, see parameter mode.
As is standard in empirical Bayes methods, the fitting proceeds in two stages: i) estimate g by maximizing a (possibly penalized) likelihood; ii) compute the posterior distribution for each beta[j] | betahat[j],sebetahat[j] using the estimated g as the prior distribution.
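The two stages can be sketched in base R for the simplest setting: a normal likelihood and a prior g that is a mixture of zero-centered normals on a fixed grid. Everything here is an assumption made for illustration (the grid mixsd, the plain unpenalized EM loop, the simulated data); it is not the package's implementation, which offers other component families and optimizers.

```r
set.seed(1)

# Simulated data: half the effects are exactly zero, half are N(0, 1).
n       <- 200
beta    <- c(rep(0, n / 2), rnorm(n / 2))
s       <- runif(n, 0.5, 1.5)          # standard errors
betahat <- rnorm(n, beta, s)

# Fixed grid of prior standard deviations; 0 is a point mass at zero.
mixsd <- c(0, 0.25, 0.5, 1, 2, 4)
K     <- length(mixsd)

# L[j, k] = marginal density of betahat[j] under component k,
# because betahat[j] ~ N(0, mixsd[k]^2 + s[j]^2) under that component.
L <- sapply(mixsd, function(sd_k) dnorm(betahat, 0, sqrt(sd_k^2 + s^2)))

# Stage i: estimate the mixture proportions by EM (unpenalized likelihood).
pi_hat <- rep(1 / K, K)
for (iter in 1:500) {
  W      <- sweep(L, 2, pi_hat, "*")   # unnormalized responsibilities
  W      <- W / rowSums(W)
  pi_hat <- colMeans(W)
}

# Stage ii: posterior for each beta[j] using the fitted g as the prior.
W <- sweep(L, 2, pi_hat, "*")
W <- W / rowSums(W)                    # posterior component weights
# Within component k, the posterior mean is betahat * v / (v + s^2),
# where v = mixsd[k]^2.
shrink    <- outer(s^2, mixsd^2, function(s2, v) v / (v + s2))
post_mean <- rowSums(W * shrink) * betahat

print(round(pi_hat, 3))
```

Each posterior mean is a weighted average of shrunken copies of betahat[j], so it is never larger in magnitude than betahat[j] itself.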
A more general case allows that beta[j]/sebetahat[j]^alpha | sebetahat[j] ~ g.
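Written out, the model described above reads as follows for the default normal likelihood (df=NULL). The symbols $\hat\beta_j$ and $\hat s_j$ stand for betahat[j] and sebetahat[j]; this notation is introduced here for illustration only.

```latex
\begin{aligned}
\hat\beta_j \mid \beta_j, \hat s_j &\sim N(\beta_j, \hat s_j^{2}), \\
\beta_j / \hat s_j^{\alpha} \mid \hat s_j &\sim g, \qquad g \in \mathcal{G}.
\end{aligned}
```

Setting $\alpha = 0$ (the default) recovers $\beta_j \sim g$, so the effects do not depend on their standard errors, while $\alpha = 1$ places the prior on the standardized effects $\beta_j / \hat s_j$.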
Value
ash returns an object of class "ash", a list with some or all of the following elements (determined by outputlevel):
fitted_g 
fitted mixture 
loglik 
log P(D|fitted_g) 
logLR 
log[P(D|fitted_g)/P(D|beta==0)] 
result 
A data frame whose columns are: 
 NegativeProb
A vector of posterior probabilities that beta is negative.
 PositiveProb
A vector of posterior probabilities that beta is positive.
 lfsr
A vector of estimated local false sign rates.
 lfdr
A vector of estimated local false discovery rates.
 qvalue
A vector of q values.
 svalue
A vector of s values.
 PosteriorMean
A vector containing the posterior mean of beta from the mixture.
 PosteriorSD
A vector containing the corresponding posterior standard deviation.
call 
a call in which all of the specified arguments are specified by their full names 
data 
a list containing details of the data and models used (mostly for internal use) 
fit_details 
a list containing results of mixture optimization, and matrix of component loglikelihoods used in this optimization 
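How the columns of result relate to one another can be illustrated with a small base-R sketch. The probabilities below are hypothetical, and the definitions used (lfsr as the probability of a sign error, s and q values as running averages over the most significant effects) are the usual ones; that the package computes its columns exactly this way is an assumption, and tail_average is a helper named here for illustration.

```r
# Hypothetical posterior probabilities for five effects (illustration only).
NegativeProb <- c(0.01, 0.30, 0.05, 0.90, 0.10)
PositiveProb <- c(0.97, 0.10, 0.15, 0.08, 0.85)

# Posterior probability that the effect is exactly zero.
lfdr <- 1 - NegativeProb - PositiveProb

# Local false sign rate: probability of being wrong when declaring the
# more probable sign; a zero effect counts as an error.
lfsr <- 1 - pmax(NegativeProb, PositiveProb)

# Running average of a local rate over all effects that are at least as
# significant (have a local rate at least as small) as the current one.
tail_average <- function(x) {
  o      <- order(x)
  out    <- numeric(length(x))
  out[o] <- cumsum(x[o]) / seq_along(x)
  out
}

svalue <- tail_average(lfsr)
qvalue <- tail_average(lfdr)

print(data.frame(lfdr, lfsr, qvalue, svalue))
```

Because a sign error includes the case of a zero effect, lfsr is never smaller than lfdr, which makes it the more conservative of the two measures.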
Functions

ash.workhorse: Adaptive Shrinkage with full set of options.
See Also
ashci for computation of credible intervals after getting the ash object returned by ash()
Examples
beta = c(rep(0,100),rnorm(100))
sebetahat = abs(rnorm(200,0,1))
betahat = rnorm(200,beta,sebetahat)
beta.ash = ash(betahat, sebetahat)
names(beta.ash)
head(beta.ash$result) # the main dataframe of results
head(get_pm(beta.ash)) # get_pm returns posterior mean
head(get_lfsr(beta.ash)) # get_lfsr returns the local false sign rate
graphics::plot(betahat,get_pm(beta.ash),xlim=c(-4,4),ylim=c(-4,4))
## Not run:
# Why is this example included here? Peter
CIMatrix=ashci(beta.ash,level=0.95)
print(CIMatrix)
## End(Not run)
# Illustrating the nonzero mode feature.
betahat=betahat+5
beta.ash = ash(betahat, sebetahat)
graphics::plot(betahat,get_pm(beta.ash))
betan.ash=ash(betahat, sebetahat,mode=5)
graphics::plot(betahat,get_pm(betan.ash))
summary(betan.ash)
# Running ash with different error models
beta.ash1 = ash(betahat, sebetahat, lik = lik_normal())
beta.ash2 = ash(betahat, sebetahat, lik = lik_t(df=4))
e = rnorm(100)+log(rf(100,df1=10,df2=10)) # simulated data with log(F) error
e.ash = ash(e,1,lik=lik_logF(df1=10,df2=10))
# Specifying the output
beta.ash = ash(betahat, sebetahat, output = c("fitted_g","logLR","lfsr"))
# Running ash with a pre-specified g, rather than estimating it
beta = c(rep(0,100),rnorm(100))
sebetahat = abs(rnorm(200,0,1))
betahat = rnorm(200,beta,sebetahat)
true_g = normalmix(c(0.5,0.5),c(0,0),c(0,1)) # define true g
## Passing this g into ash causes it to i) take the sd and the means
## for each component from this g, and ii) initialize pi to the value
## from this g.
beta.ash = ash(betahat, sebetahat,g=true_g,fixg=TRUE)
# running with weights
beta.ash = ash(betahat, sebetahat, optmethod="w_mixEM",
weights = c(rep(0.5,100),rep(1,100)))
# Different algorithms can be used to compute maximum-likelihood
# estimates of the mixture weights. Here, we illustrate use of the
# EM algorithm and the (default) SQP algorithm.
set.seed(1)
betahat <- c(8.115,9.027,9.289,10.097,9.463)
sebeta <- c(0.6157,0.4129,0.3197,0.3920,0.5496)
fit.em <- ash(betahat,sebeta,mixcompdist = "normal",optmethod = "mixEM")
fit.sqp <- ash(betahat,sebeta,mixcompdist = "normal",optmethod = "mixSQP")
range(fit.em$fitted$pi - fit.sqp$fitted$pi)