R: Compute marginal posterior estimates

general_sequence_model {SequenceSpikeSlab}

R Documentation

Compute marginal posterior estimates

Description

This function computes marginal posterior probabilities (slab probabilities) that data points have non-zero mean for the general hierarchical prior in the sparse normal sequence model. The posterior mean is also provided.

Usage

general_sequence_model(
  x,
  sigma = 1,
  slab = "Laplace",
  log_prior = "beta-binomial",
  Laplace_lambda = 0.5,
  Cauchy_gamma = 1,
  beta_kappa = 1,
  beta_lambda,
  show_progress = TRUE
)

Arguments

`x`	Vector of n data points
`sigma`	Standard deviation of the Gaussian noise in the data. May also be set to "auto", in which case sigma is estimated using the estimateSigma function from the selectiveInference package
`slab`	Slab distribution. Must be either "Laplace" or "Cauchy".
`log_prior`	Vector of length n+1 containing the logarithms of the prior probabilities pi_n(s) that the number of spikes is equal to s for s=0,...,n. It is allowed to use an unnormalized prior that does not sum to 1, because adding any constant to the log-prior probabilities does not change the result. Instead of a vector, log_prior may also be set to "beta-binomial" as a short-hand for log_prior = lbeta(beta_kappa+(0:n),beta_lambda+n-(0:n)) - lbeta(beta_kappa,beta_lambda) + lchoose(n,0:n).
`Laplace_lambda`	Parameter of the Laplace slab
`Cauchy_gamma`	Parameter of the Cauchy slab
`beta_kappa`	Parameter of the beta-distribution in the beta-binomial prior
`beta_lambda`	Parameter of the beta-distribution in the beta-binomial prior. Default value=n+1
`show_progress`	Boolean that indicates whether to show a progress bar

Details

The run-time is O(n^2) on n data points, which means that doubling the size of the data leads to an increase in computation time by approximately a factor of 4. Data sets of size n=25,000 should be feasible within approximately 30 minutes.

Value

list (postprobs, postmean, sigma), where postprobs is a vector of marginal posterior slab probabilities that x[i] has non-zero mean for i=1,...,n; postmean is a vector with the posterior mean for the x[i]; and sigma is the value of sigma (this may be of interest when the sigma="auto" option is used)

Examples

# Experiments similar to those of Castilo, Van der Vaart, 2012

# Generate data
n <- 500           # sample size
n_signal <- 25     # number of non-zero theta
A <- 5             # signal strength
theta <- c(rep(A,n_signal), rep(0,n-n_signal))
x <- theta + rnorm(n, sd=1)

# Choose slab
slab <- "Laplace"
Laplace_lambda <- 0.5

# Prior 1
kappa1 <- 0.4   # hyperparameter
logprior1 <- c(0,-kappa1*(1:n)*log(n*3/(1:n)))
res1 <- general_sequence_model(x, sigma=1,
                               slab=slab,
                               log_prior=logprior1,
                               Laplace_lambda=Laplace_lambda)
print("Prior 1: Elements with marginal posterior probability >= 0.5:")
print(which(res1$postprobs >= 0.5))

# Prior 2
kappa2 <- 0.8   # hyperparameter
logprior2 <- kappa2*lchoose(2*n-0:n,n)
res2 <- general_sequence_model(x, sigma=1,
                               slab=slab,
                               log_prior=logprior2,
                               Laplace_lambda=Laplace_lambda)
print("Prior 2: Elements with marginal posterior probability >= 0.5:")
print(which(res2$postprobs >= 0.5))

# Prior 3
beta_kappa <- 1      # hyperparameter
beta_lambda <- n+1   # hyperparameter
res3 <- general_sequence_model(x, sigma=1,
                               slab=slab,
                               log_prior="beta-binomial",
                               Laplace_lambda=Laplace_lambda)
print("Prior 3: Elements with marginal posterior probability >= 0.5:")
print(which(res3$postprobs >= 0.5))

# Plot means for all priors
M=max(abs(x))+1
plot(1:n, x, pch=20, ylim=c(-M,M), col='green', xlab="", ylab="", main="Posterior Means")
points(1:n, theta, pch=20, col='blue')
points(1:n, res1$postmean, pch=20, col='black', cex=0.6)
points(1:n, res2$postmean, pch=20, col='magenta', cex=0.6)
points(1:n, res3$postmean, pch=20, col='red', cex=0.6)
legend("topright", legend=c("posterior mean 1", "posterior mean 2", "posterior mean 3",
                            "data", "truth"),
       col=c("black", "magenta", "red", "green", "blue"), pch=20, cex=0.7)

[Package SequenceSpikeSlab version 1.0.1 Index]