bayes_sim {bayesassurance}

R Documentation

Bayesian Simulation in Conjugate Linear Model Framework

Description

Approximates, through Monte Carlo sampling, the Bayesian assurance of attaining either u'\beta > C, u'\beta < C, or u'\beta \neq C for equal-sized samples. The function can also process longitudinal data. See the argument descriptions for details.
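
The Monte Carlo idea can be illustrated with a minimal base-R sketch for a scalar `\beta` and the one-sided case u'\beta > C. It is not the package's implementation: the helper name `assurance_sketch`, the intercept-only design, the identity `Vn`, and the posterior-probability decision rule are all assumptions made for illustration, based on the standard conjugate normal linear model.

```r
# Hypothetical helper (not part of bayesassurance): scalar beta, alt = "greater".
assurance_sketch <- function(n, C, sigsq, mu_beta_d, Vbeta_d,
                             mu_beta_a = 0, Vbeta_a_inv = 0,
                             alpha = 0.05, mc_iter = 1000) {
  X <- rep(1, n)                      # assumed design: a single column of ones
  hits <- logical(mc_iter)
  for (i in seq_len(mc_iter)) {
    # Design stage: draw beta from the design prior, then simulate the data
    beta <- rnorm(1, mu_beta_d, sqrt(sigsq * Vbeta_d))
    y <- X * beta + rnorm(n, 0, sqrt(sigsq))
    # Analysis stage: conjugate posterior is N(M * m, sigsq * M)
    M <- 1 / (Vbeta_a_inv + sum(X^2))
    m <- Vbeta_a_inv * mu_beta_a + sum(X * y)
    # Objective met when the posterior P(beta > C | y) exceeds 1 - alpha
    z <- (M * m - C) / sqrt(sigsq * M)
    hits[i] <- z > qnorm(1 - alpha)
  }
  mean(hits)                          # proportion of data sets meeting the objective
}

set.seed(1)
a <- assurance_sketch(n = 100, C = 0.15, sigsq = 0.265,
                      mu_beta_d = 0.25, Vbeta_d = 1e-8)
a
```

With a near-degenerate design prior and a flat analysis prior, as here, the estimate should sit close to the frequentist power of the corresponding one-sided z-test, up to Monte Carlo error.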

Usage

bayes_sim(
  n,
  p = NULL,
  u,
  C,
  Xn = NULL,
  Vn = NULL,
  Vbeta_d,
  Vbeta_a_inv,
  sigsq,
  mu_beta_d,
  mu_beta_a,
  alt = "two.sided",
  alpha,
  mc_iter,
  longitudinal = FALSE,
  ids = NULL,
  from = NULL,
  to = NULL,
  poly_degree = NULL
)

Arguments

`n`	sample size (either scalar or vector). When `longitudinal = TRUE`, `n` denotes the number of observations per subject.
`p`	column dimension of design matrix `Xn`. If `Xn = NULL`, `p` must be specified to denote the column dimension of the default design matrix generated by the function.
`u`	a scalar or vector included in the expression to be evaluated, e.g. `u'\beta > C`, where `\beta` is an unknown parameter that is to be estimated.
`C`	constant that `u'\beta` is compared against
`Xn`	design matrix that characterizes the model from which the data are generated. This is specifically given by the normal linear regression model `yn = Xn\beta + \epsilon`, `\epsilon ~ N(0, \sigma^2 Vn)`. When set to `NULL`, `Xn` is generated in-function using either `bayesassurance::gen_Xn()` or `bayesassurance::gen_Xn_longitudinal()`. Note that setting `Xn = NULL` also enables the user to pass in a vector of sample sizes for evaluation, as the function will automatically adjust `Xn` to each sample size.
`Vn`	a correlation matrix for the marginal distribution of the sample data `yn`. Defaults to the identity matrix when set to `NULL`.
`Vbeta_d`	correlation matrix that helps describe the prior information on `\beta` in the design stage
`Vbeta_a_inv`	inverse-correlation matrix that helps describe the prior information on `\beta` in the analysis stage
`sigsq`	a known and fixed constant preceding all correlation matrices `Vn`, `Vbeta_d`, and `Vbeta_a_inv`.
`mu_beta_d`	mean of the prior on `\beta` in the design stage
`mu_beta_a`	mean of the prior on `\beta` in the analysis stage
`alt`	specifies alternative test case, where `alt = "greater"` tests if `u'\beta > C`, `alt = "less"` tests if `u'\beta < C`, and `alt = "two.sided"` performs a two-sided test. By default, `alt = "two.sided"`.
`alpha`	significance level
`mc_iter`	number of MC samples evaluated under the analysis objective
`longitudinal`	when set to `TRUE`, constructs design matrix using inputs that correspond to a balanced longitudinal study design.
`ids`	vector of unique subject ids, usually of length 2 for study design purposes.
`from`	start time of repeated measures for each subject
`to`	end time of repeated measures for each subject
`poly_degree`	only needed if `longitudinal = TRUE`, specifies highest degree taken in the longitudinal model.
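
Since `sigsq` multiplies each of the matrices `Vn`, `Vbeta_d`, and `Vbeta_a_inv`, a prior covariance chosen on the original scale has to be divided by `sigsq` before it is passed in. The sketch below shows that bookkeeping in base R; the name `Sigma_d` is hypothetical, and reading an all-zero `Vbeta_a_inv` as a flat (zero-precision) analysis prior is an assumption based on the standard conjugate model.

```r
sigsq <- 100

# Intended design-stage prior covariance of beta (illustrative values)
Sigma_d <- matrix(c(4, 0, 3, 0,
                    0, 6, 0, 0,
                    3, 0, 4, 0,
                    0, 0, 0, 6), nrow = 4, byrow = TRUE)

# Rescale so that sigsq * Vbeta_d recovers the intended covariance
Vbeta_d <- Sigma_d / sigsq

# Zero prior precision in the analysis stage: a weak (flat) analysis prior
Vbeta_a_inv <- matrix(0, nrow = 4, ncol = 4)

# Sanity check on the scaling
all.equal(sigsq * Vbeta_d, Sigma_d)
```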

Value

a list of objects corresponding to the assurance approximations

assurance_table: table of sample size and corresponding assurance values
assurance_plot: plot of assurance values
mc_samples: number of Monte Carlo samples that were generated and evaluated

Examples


## Example 1
## A single Bayesian assurance value obtained given a scalar sample size
## n and p=1. Note that setting p=1 implies that
## beta is a scalar parameter.

bayesassurance::bayes_sim(n=100, p = 1, u = 1, C = 0.15, Xn = NULL, 
Vbeta_d = 1e-8, Vbeta_a_inv = 0, Vn = NULL, sigsq = 0.265, mu_beta_d = 0.3, 
mu_beta_a = 0, alt = "two.sided", alpha = 0.05, mc_iter = 5000)


## Example 2
## Illustrates a scenario in which weak analysis priors and strong 
## design priors are assigned to enable overlap between the frequentist 
## power and Bayesian assurance.


library(ggplot2)
n <- seq(100, 250, 5)

 ## Frequentist Power
 power <- bayesassurance::pwr_freq(n, sigsq = 0.265, theta_0 = 0.15,
 theta_1 = 0.25, alt = "greater", alpha = 0.05)

 ## Bayesian simulation values with specified values from the n vector
 assurance <- bayesassurance::bayes_sim(n, p = 1, u = 1, C = 0.15, Xn = NULL,
 Vbeta_d = 1e-8, Vbeta_a_inv = 0, Vn = NULL, sigsq = 0.265, mu_beta_d = 0.25,
 mu_beta_a = 0, alt = "greater", alpha = 0.05, mc_iter = 1000)

## Visual representation of the two curves overlaid on one another
df1 <- data.frame(n = n, power = power$pwr_table$Power)
df2 <- data.frame(n = n, assurance = assurance$assurance_table$Assurance)

plot_curves <- ggplot2::ggplot(df1, ggplot2::aes(x = n, y = power,
  color = "Frequentist")) + ggplot2::geom_line(lwd = 1.2)
plot_curves <- plot_curves + ggplot2::geom_point(data = df2, alpha = 0.5,
  ggplot2::aes(x = n, y = assurance, color = "Bayesian"), size = 1.2) +
  ggplot2::ggtitle("Bayesian Simulation vs. Frequentist Power Computation")
plot_curves


## Example 3
## Longitudinal example where n now denotes the number of repeated measures 
## per subject and design matrix is determined accordingly.

## number of repeated measures per subject
n <- seq(10, 100, 5)
## subject ids
ids <- c(1,2)
sigsq <- 100
Vbeta_a_inv <- matrix(rep(0, 16), nrow = 4, ncol = 4)
Vbeta_d <- (1 / sigsq) * 
matrix(c(4, 0, 3, 0, 0, 6, 0, 0, 3, 0, 4, 0, 0, 0, 0, 6), 
nrow = 4, ncol = 4)

assur_out <- bayesassurance::bayes_sim(n = n, p = NULL,
                      u = c(1, -1, 1, -1), C = 0,
                      Xn = NULL, Vbeta_d = Vbeta_d,
                      Vbeta_a_inv = Vbeta_a_inv,
                      Vn = NULL, sigsq = sigsq,
                      mu_beta_d = as.matrix(c(5, 6.5, 62, 84)),
                      mu_beta_a = as.matrix(rep(0, 4)), mc_iter = 1000,
                      alt = "two.sided", alpha = 0.05,
                      longitudinal = TRUE, ids = ids,
                      from = 10, to = 120)
assur_out$assurance_plot
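
As a cross-check on Example 2, the frequentist curve can be reproduced without the package using the textbook power formula for a one-sided z-test with known variance. The helper `power_one_sided` is hypothetical, and it is only an assumption that this matches what `pwr_freq` computes internally.

```r
# Hypothetical helper: closed-form power of a one-sided z-test of
# H0: theta = theta_0 against H1: theta > theta_0, evaluated at theta_1.
power_one_sided <- function(n, sigsq, theta_0, theta_1, alpha = 0.05) {
  pnorm(sqrt(n) * (theta_1 - theta_0) / sqrt(sigsq) - qnorm(1 - alpha))
}

n <- seq(100, 250, 5)
pw <- power_one_sided(n, sigsq = 0.265, theta_0 = 0.15, theta_1 = 0.25)
head(data.frame(n = n, power = pw))
```

With a strong design prior centered at `theta_1` and a weak analysis prior, the simulated assurance values from `bayes_sim` are expected to scatter around this curve.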