svb.fit {survival.svb} - R Documentation
Fit sparse variational Bayesian proportional hazards models.
Description
Fit sparse variational Bayesian proportional hazards models.
Usage
svb.fit(
  Y,
  delta,
  X,
  lambda = 1,
  a0 = 1,
  b0 = ncol(X),
  mu.init = NULL,
  s.init = rep(0.05, ncol(X)),
  g.init = rep(0.5, ncol(X)),
  maxiter = 1000,
  tol = 0.001,
  alpha = 1,
  center = TRUE,
  verbose = TRUE
)
Arguments

Y
    Failure times.

delta
    Censoring indicator, 0: censored, 1: uncensored.

X
    Design matrix.

lambda
    Penalisation parameter, default: lambda = 1.

a0
    Beta distribution parameter, default: a0 = 1.

b0
    Beta distribution parameter, default: b0 = ncol(X).

mu.init
    Initial value for the mean of the Gaussian component of the
    variational family (\mu), default: NULL.

s.init
    Initial value for the standard deviations of the Gaussian
    component of the variational family (s), default:
    rep(0.05, ncol(X)).

g.init
    Initial value for the inclusion probability (\gamma), default:
    rep(0.5, ncol(X)).

maxiter
    Maximum number of iterations, default: 1000.

tol
    Convergence tolerance, default: 0.001.

alpha
    The elastic-net mixing parameter used for initialising mu.init,
    default: 1.

center
    Center X prior to fitting; increases numerical stability.
    Default: TRUE.

verbose
    Print additional information, default: TRUE.
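For example, a call overriding a few of these defaults might look as
follows. This is a sketch only: Y, delta and X are as constructed in
the Examples below, and the specific values are illustrative rather
than recommendations.

# hypothetical call with a weaker penalty, a lower initial inclusion
# probability, more iterations, and per-iteration output suppressed
fit <- survival.svb::svb.fit(Y, delta, X,
                             lambda  = 0.5,
                             g.init  = rep(0.1, ncol(X)),
                             maxiter = 2000,
                             verbose = FALSE)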
Value

Returns a list containing:

beta_hat
    Point estimate for the coefficients.

inclusion_prob
    Posterior inclusion probabilities, i.e. the posterior probability
    that each coefficient is non-zero.

m
    Final value for the means of the Gaussian component of the
    variational family (\mu).

s
    Final value for the standard deviations of the Gaussian component
    of the variational family (s).

g
    Final value for the inclusion probability (\gamma).

lambda
    Value of lambda used.

a0
    Value of a0 used.

b0
    Value of b0 used.

converged
    Describes whether the algorithm converged.
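As a sketch of how these components might be used, with f a fitted
object as in the Examples below, one could check convergence and read
off a variable selection from the inclusion probabilities:

stopifnot(f$converged)                     # check the algorithm converged
selected <- which(f$inclusion_prob > 0.5)  # median probability model
f$beta_hat[selected]                       # point estimates of selected coefficients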
Details

Rather than computing the posterior using MCMC, we approximate it
using variational inference. Within variational inference we re-cast
Bayesian inference as an optimisation problem, in which we minimise
the Kullback-Leibler (KL) divergence between a family of tractable
distributions and the posterior, \Pi. In our case we use a mean-field
variational family,

Q = \left\{ \prod_{j=1}^p \left[ \gamma_j N(\mu_j, s_j^2) + (1 - \gamma_j) \delta_0 \right] \right\}

where \mu_j is the mean and s_j the standard deviation of the
Gaussian component, \gamma_j the inclusion probability, \delta_0 a
Dirac mass at zero, and p the number of coefficients. The components
of the variational family (\mu, s, \gamma) are then optimised by
minimising the KL divergence between the variational family and the
posterior,

\tilde{\Pi} = \arg \min KL(Q \| \Pi).

We use co-ordinate ascent variational inference (CAVI) to solve this
optimisation problem.
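Because the optimised variational family is itself a fully specified
distribution, posterior summaries can be approximated by sampling from
it directly. The sketch below assumes a fitted object f with
components m, s and g as described under Value, and draws from the
spike-and-slab family Q defined above; the helper rq1 is ours, not
part of the package.

rq1 <- function(f) {
    # one draw from the fitted mean-field family: coefficient j is
    # N(m_j, s_j^2) with probability g_j, and exactly zero otherwise
    slab <- runif(length(f$g)) < f$g
    ifelse(slab, rnorm(length(f$m), f$m, f$s), 0)
}
draws <- replicate(1000, rq1(f))                          # p x 1000 samples
ci <- apply(draws, 1, quantile, probs = c(0.025, 0.975))  # approx. 95% credible intervals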
Examples
n <- 125 # number of samples
p <- 250 # number of features
s <- 5 # number of non-zero coefficients
censoring_lvl <- 0.25 # degree of censoring
# generate some test data
set.seed(1)
b <- sample(c(runif(s, -2, 2), rep(0, p-s)))
X <- matrix(rnorm(n * p), nrow=n)
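# exponential survival times with rate exp(X %*% b), via inverse-transform sampling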
Y <- log(1 - runif(n)) / -exp(X %*% b)
delta <- runif(n) > censoring_lvl # 0: censored, 1: uncensored
Y[!delta] <- Y[!delta] * runif(sum(!delta)) # rescale censored data
# fit the model
f <- survival.svb::svb.fit(Y, delta, X, mu.init=rep(0, p))
## Larger Example
n <- 250 # number of samples
p <- 1000 # number of features
s <- 10 # number of non-zero coefficients
censoring_lvl <- 0.4 # degree of censoring
# generate some test data
set.seed(1)
b <- sample(c(runif(s, -2, 2), rep(0, p-s)))
X <- matrix(rnorm(n * p), nrow=n)
Y <- log(1 - runif(n)) / -exp(X %*% b)
delta <- runif(n) > censoring_lvl # 0: censored, 1: uncensored
Y[!delta] <- Y[!delta] * runif(sum(!delta)) # rescale censored data
# fit the model
f <- survival.svb::svb.fit(Y, delta, X)
# plot the results
plot(b, xlab=expression(beta), main="Coefficient value", pch=8, ylim=c(-2,2))
points(f$beta_hat, pch=20, col=2)
legend("topleft", legend=c(expression(beta), expression(hat(beta))),
pch=c(8, 20), col=c(1, 2))
plot(f$inclusion_prob, main="Inclusion Probabilities", ylab=expression(gamma))