mcmcObsProb {BayesPostEst} | R Documentation |
Predicted Probabilities using Bayesian MCMC estimates for the Average of Observed Cases
Description
Implements R function to calculate the predicted probabilities for "observed" cases after a Bayesian logit or probit model, following Hanmer & Kalkan (2013) (2013, American Journal of Political Science 57(1): 263-277).
Usage
mcmcObsProb(
modelmatrix,
mcmcout,
xcol,
xrange,
xinterest,
link = "logit",
ci = c(0.025, 0.975),
fullsims = FALSE
)
Arguments
modelmatrix |
model matrix, including intercept (if the intercept is among the
parameters estimated in the model). Create with model.matrix(formula, data).
Note: the order of columns in the model matrix must correspond to the order of columns
in the matrix of posterior draws in the |
mcmcout |
posterior distributions of all logit coefficients,
in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed
into a matrix using the function as.mcmc() from the coda package for |
xcol |
column number of the posterior draws ( |
xrange |
name of the vector with the range of relevant values of the explanatory variable for which to calculate associated Pr(y = 1). |
xinterest |
semi-optional argument. Name of the explanatory variable for which
to calculate associated Pr(y = 1). If |
link |
type of generalized linear model; a character vector set to |
ci |
the bounds of the credible interval. Default is |
fullsims |
logical indicator of whether full object (based on all MCMC draws
rather than their average) will be returned. Default is |
Details
This function calculates predicted probabilities for "observed" cases after a Bayesian logit or probit model following Hanmer and Kalkan (2013, American Journal of Political Science 57(1): 263-277)
Value
if fullsims = FALSE
(default), a tibble with 4 columns:
x: value of variable of interest, drawn from
xrange
median_pp: median predicted Pr(y = 1) when variable of interest is set to x
lower_pp: lower bound of credible interval of predicted probability at given x
upper_pp: upper bound of credible interval of predicted probability at given x
if fullsims = TRUE
, a tibble with 3 columns:
Iteration: number of the posterior draw
x: value of variable of interest, drawn from
xrange
pp: average predicted Pr(y = 1) of all observed cases when variable of interest is set to x
References
Hanmer, Michael J., & Ozan Kalkan, K. (2013). Behind the curve: Clarifying the best approach to calculating predicted probabilities and marginal effects from limited dependent variable models. American Journal of Political Science, 57(1), 263-277. https://doi.org/10.1111/j.1540-5907.2012.00602.x
Examples
if (interactive()) {
## simulating data
set.seed(12345)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))
## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
logit(p[i]) <- mu[i] ## Logit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
library(R2jags)
set.seed(123)
fit <- jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2, n.iter = 2000,
n.burnin = 1000, model.file = model)
### observed value approach
library(coda)
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
X1_sim <- seq(from = min(datjags$X1),
to = max(datjags$X1),
length.out = 10)
obs_prob <- mcmcObsProb(modelmatrix = xmat,
mcmcout = mcmc_mat,
xrange = X1_sim,
xcol = 2)
}