normalizing.constant {BayesPPD} | R Documentation |
Function for approximating the normalizing constant for generalized linear models with random a0
Description
This function returns a vector of coefficients that defines a function f(a_0) approximating the normalizing constant for generalized linear models with random a_0. The user should pass the returned vector to glm.random.a0 or power.glm.random.a0.
Usage
normalizing.constant(
grid,
historical,
data.type,
data.link,
prior.beta.var = rep(10, 50),
lower.limits = rep(-100, 50),
upper.limits = rep(100, 50),
slice.widths = rep(1, 50),
nMC = 10000,
nBI = 250
)
Arguments
grid: Matrix of potential values for a_0, with one column per historical dataset.
historical: List of historical dataset(s). Each historical dataset is stored in a list which contains two named elements: y0, the response vector, and x0, the matrix of covariates. For binomial data, an additional element giving the vector of numbers of trials is required.
data.type: Character string specifying the type of response. The options are "Bernoulli", "Binomial", "Poisson" and "Exponential".
data.link: Character string specifying the link function. The options are "Logistic", "Probit", "Log", "Identity-Positive", "Identity-Probability" and "Complementary Log-Log". Does not apply if …
prior.beta.var: Vector of variances of the independent normal initial priors on beta. The length of the vector should be equal to the total number of parameters, i.e., P+1, where P is the number of covariates. The default variance is 10 for all parameters.
lower.limits: Vector of lower limits for the parameters, used by the slice sampler. The length of the vector should be equal to the total number of parameters, i.e., P+1, where P is the number of covariates. The default is -100 for all parameters (which may not be appropriate for all situations). Does not apply if …
upper.limits: Vector of upper limits for the parameters, used by the slice sampler. The length of the vector should be equal to the total number of parameters, i.e., P+1, where P is the number of covariates. The default is 100 for all parameters (which may not be appropriate for all situations). Does not apply if …
slice.widths: Vector of initial slice widths for the parameters, used by the slice sampler. The length of the vector should be equal to the total number of parameters, i.e., P+1, where P is the number of covariates. The default is 1 for all parameters (which may not be appropriate for all situations). Does not apply if …
nMC: Number of iterations (excluding burn-in samples) for the slice sampler or Gibbs sampler. The default is 10,000.
nBI: Number of burn-in samples for the slice sampler or Gibbs sampler. The default is 250.
Details
This function performs the following steps:

1. Suppose there are K historical datasets. The user inputs a grid of M rows and K columns of potential values for a_0. For example, one can choose the vector v = c(0.1, 0.25, 0.5, 0.75, 1) and use expand.grid(a0_1=v, a0_2=v, a0_3=v) when K=3 to get a grid with M=5^3=125 rows and 3 columns. If there are more than three historical datasets, the dimension of v can be reduced to limit the size of the grid. A large grid will increase runtime.

2. For each of the M rows of a_0 values in the grid, obtain samples of beta from the power prior associated with the current values of a_0, using the slice sampler.

3. For each of the M sets of posterior samples, execute the PWK algorithm (Wang et al., 2018) to estimate the logarithms of the normalizing constants d_1, ..., d_M of the normalized power prior.

4. At this point, one has a dataset with outcomes d_1, ..., d_M and predictors corresponding to the rows of the a_0 grid matrix. A polynomial regression is applied to estimate a function d = f(a_0). The degree of the polynomial regression is determined by the algorithm so that R^2 > 0.99.

5. The vector of coefficients from the polynomial regression model is returned by the function, and must be passed by the user to glm.random.a0 or power.glm.random.a0.
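The degree-selection step above can be sketched as follows. This is an illustrative reconstruction, not the package's internal code: the helper name fit_f_a0 and the design matrix it builds (one block of raw powers 1..L per historical dataset, no interaction terms) are assumptions made for the sketch.

```r
# Sketch: given estimated log normalizing constants d (length M) and the
# a0 grid (M rows, K columns), fit polynomial regressions of increasing
# degree until R^2 exceeds 0.99.
fit_f_a0 <- function(grid, d, max_degree = 10) {
  for (L in 1:max_degree) {
    # raw polynomial terms of each a0 column up to degree L
    X <- do.call(cbind, lapply(grid, function(a) outer(a, 1:L, "^")))
    fit <- lm(d ~ X)
    if (summary(fit)$r.squared > 0.99) {
      # 1 + K*L coefficients, matching the length described in Value
      return(list(degree = L, coefficients = coef(fit)))
    }
  }
  stop("R^2 > 0.99 not reached; consider refining the grid")
}
```

Note that the fitted model has 1 + K*L coefficients (intercept plus L powers of each of the K columns), consistent with the length of the vector described under Value.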
When a row of the grid contains elements that are close to zero, the resulting power prior is nearly flat and the estimates of the normalizing constants may be inaccurate. Therefore, it is recommended that grid values be at least 0.05. If the error message "some coefficients are not defined because of singularities" occurs, possible causes include: too many or too few grid rows, insufficient sample size of the historical data, an insufficient number of iterations for the slice sampler, or near-zero grid values.
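For instance, a grid that respects this recommendation can be built with expand.grid, keeping every candidate value at or above 0.05 (the names a0_1, a0_2 follow the convention used in the Examples below):

```r
# Build a K-column grid whose smallest a0 value is 0.05
v <- seq(0.05, 1, length.out = 5)        # candidate a0 values, all >= 0.05
K <- 2                                    # number of historical datasets
grid <- expand.grid(rep(list(v), K))      # all 5^K combinations
names(grid) <- paste0("a0_", seq_len(K))
```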
Note that due to computational intensity, the normalizing.constant function has not been evaluated for accuracy for high dimensional beta (e.g., dimension > 10) or high dimensional a_0 (e.g., dimension > 5).
Value
Vector of coefficients for a_0 that defines a function f(a_0) approximating the normalizing constant, required by the functions glm.random.a0 and power.glm.random.a0. The length of the vector is 1 + K*L, where K is the number of historical datasets and L is the degree of the polynomial regression determined by the algorithm.
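To make the role of the returned vector concrete, a hypothetical evaluator is sketched below. The coefficient ordering it assumes (intercept first, then powers 1..L of a0_1, then powers 1..L of a0_2, and so on) is an assumption made for illustration; consult the package source for the actual layout before relying on it.

```r
# Hypothetical helper: evaluate the polynomial approximation f(a0) from a
# coefficient vector of length 1 + K*L. The assumed ordering is: intercept,
# then powers 1..L of each a0_k in turn.
eval_f_a0 <- function(coefs, a0) {
  K <- length(a0)
  L <- (length(coefs) - 1) / K
  # powers laid out as a0_1^1..a0_1^L, a0_2^1..a0_2^L, ...
  powers <- as.vector(t(outer(a0, 1:L, "^")))
  coefs[1] + sum(coefs[-1] * powers)
}
```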
References
Wang, Yu-Bo; Chen, Ming-Hui; Kuo, Lynn; Lewis, Paul O. A New Monte Carlo Method for Estimating Marginal Likelihoods. Bayesian Anal. 13 (2018), no. 2, 311–333.
See Also
glm.random.a0
and power.glm.random.a0
Examples
data.type <- "Bernoulli"
data.link <- "Logistic"
data.size <- 50

# Simulate two historical datasets, each with a single covariate
p <- 1
set.seed(111)
x1 <- matrix(rnorm(p*data.size), ncol=p, nrow=data.size)
set.seed(222)
x2 <- matrix(rnorm(p*data.size), ncol=p, nrow=data.size)
# beta[1] is the intercept and beta[-1] the covariate effects;
# compute the Bernoulli means through the logistic link
beta <- c(1, 2)
eta1 <- beta[1] + x1 %*% beta[-1]
eta2 <- beta[1] + x2 %*% beta[-1]
mean1 <- exp(eta1)/(1 + exp(eta1))
mean2 <- exp(eta2)/(1 + exp(eta2))
historical <- list(list(y0=rbinom(data.size, size=1, prob=mean1), x0=x1),
                   list(y0=rbinom(data.size, size=1, prob=mean2), x0=x2))

# Create a grid of possible values of a0 with two columns corresponding to a0_1 and a0_2
g <- c(0.1, 0.25, 0.5, 0.75, 1)
grid <- expand.grid(a0_1=g, a0_2=g)

nMC <- 100 # nMC should be larger in practice
nBI <- 50
result <- normalizing.constant(grid=grid, historical=historical,
                               data.type=data.type, data.link=data.link,
                               nMC=nMC, nBI=nBI)