ebayesthresh {EbayesThresh} | R Documentation |
Empirical Bayes thresholding on a sequence
Description
Given a sequence of data, performs Empirical Bayes thresholding, as discussed in Johnstone and Silverman (2004).
Usage
ebayesthresh(x, prior = "laplace", a = 0.5, bayesfac = FALSE,
sdev = NA, verbose = FALSE, threshrule = "median", universalthresh = TRUE,
stabadjustment)
Arguments
x |
Vector of data values. |
prior |
Specification of prior to be used conditional on the
mean being nonzero; can be |
a |
Scale factor if Laplace prior is used. Ignored if Cauchy prior
is used. If, on entry, |
bayesfac |
If |
sdev |
The sampling standard deviation of the data |
verbose |
Controls the level of output. See below. |
threshrule |
Specifies the thresholding rule to be applied to the
data. Possible values are |
universalthresh |
If |
stabadjustment |
If
|
Details
It is assumed that the data vector (x_1, \ldots, x_n)
is such that
each x_i
is drawn independently from a normal distribution with
mean \theta_i
and variance \sigma_i^2
(\sigma_i
is the
same in the homogeneous case). The prior distribution of each
\theta_i
is a mixture with probability 1-w
of zero and
probability w
of a given symmetric heavy-tailed distribution. The
mixing weight, and possibly a scale factor in the symmetric
distribution, are estimated by marginal maximum likelihood. The
resulting values are used as the hyperparameters in the prior.
The true effects \theta_i
can be estimated as the posterior median
or the posterior mean given the data, or by hard or soft thresholding
using the posterior median threshold. If hard or soft thresholding is
chosen, then there is the additional choice of using the Bayes factor
threshold, which is the value such thatthe posterior probability of zero
is exactly half if the data value is equal to the threshold.
Value
If verbose = FALSE
, a vector giving the values of the estimates
of the underlying mean vector.
If verbose = TRUE
, a list with the following elements:
muhat |
the estimated mean vector (omitted if |
x |
the data vector as supplied |
threshold.sdevscale |
the threshold as a multiple of the standard
deviation |
threshold.origscale |
the threshold measured on the original scale of the data |
prior |
the prior that was used |
w |
the mixing weight as estimated by marginal maximum likelihood |
a |
(only present if Laplace prior used) the scale factor as supplied or estimated |
bayesfac |
the value of the parameter |
sdev |
the standard deviations of the data as supplied or estimated |
threshrule |
the thresholding rule used, as specified above |
Author(s)
Bernard Silverman
References
Johnstone, I. M. and Silverman, B. W. (2004) Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences. Annals of Statistics, 32, 1594–1649.
Johnstone, I. M. and Silverman, B. W. (2004) EbayesThresh: R software for Empirical Bayes thresholding. Journal of Statistical Software, 12.
Johnstone, I. M. (2004) 'Function Estimation and Classical Normal Theory' ‘The Threshold Selection Problem’. The Wald Lectures I and II, 2004. Available from http://www-stat.stanford.edu/~imj.
Johnstone, I. M. and Silverman, B. W. (2005) Empirical Bayes selection of wavelet thresholds. Annals of Statistics, 33, 1700–1752.
The papers by Johnstone and Silverman are available from http://www.bernardsilverman.com.
See also http://www-stat.stanford.edu/~imj for further references, including the draft of a monograph by I. M. Johnstone.
See Also
Examples
# Data with homogeneous sampling standard deviation using
# Cauchy prior.
eb1 <- ebayesthresh(x = rnorm(100, c(rep(0,90),rep(5,10))),
prior = "cauchy", sdev = NA)
# Data with homogeneous sampling standard deviation using
# Laplace prior.
eb2 <- ebayesthresh(x = rnorm(100, c(rep(0,90), rep(5,10))),
prior = "laplace", sdev = 1)
# Data with heterogeneous sampling standard deviation using
# Laplace prior.
set.seed(123)
mu <- c(rep(0,90), rep(5,10))
sd <- c(rep(1, 40), rep(3, 60))
x <- mu + rnorm(100, sd = sd)
# With constraints on thresholds.
eb3 <- ebayesthresh(x = x, prior = "laplace", a = NA, sdev = sd)
# Without constraints on thresholds. Observe that the estimates with
# constraints on thresholds have fewer zeroes than the estimates without
# constraints.
eb4 <- ebayesthresh(x = x, prior = "laplace", a = NA, sdev = sd,
universalthresh = FALSE)
print(sum(eb3 == 0))
print(sum(eb4 == 0))
# Data with heterogeneous sampling standard deviation using Laplace
# prior.
set.seed(123)
mu <- c(rep(0,90), rep(5,10))
sd <- c(rep(1, 40), rep(5,40), rep(15, 20))
x <- mu + rnorm(100, sd = sd)
# In this example, infinity is returned as estimate when some of the
# sampling standard deviations are extremely large. However, this can
# be solved by stabilizing the data sequence before the analysis.
eb5 <- ebayesthresh(x = x, prior = "laplace", a = NA, sdev = sd)
# With stabilization.
eb6 <- ebayesthresh(x = x, prior = "laplace", a = NA, sdev = sd,
stabadjustment = TRUE)