brm {brms}  R Documentation 
Fit Bayesian Generalized (Non)Linear Multivariate Multilevel Models
Description
Fit Bayesian generalized (non)linear multivariate multilevel models using Stan for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit – among others – linear, robust linear, count data, survival, response times, ordinal, zeroinflated, hurdle, and even selfdefined mixture models all in a multilevel context. Further modeling options include nonlinear and smooth terms, autocorrelation structures, censored data, metaanalytic standard errors, and quite a few more. In addition, all parameters of the response distributions can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their beliefs. In addition, model fit can easily be assessed and compared with posterior predictive checks and leaveoneout crossvalidation.
Usage
brm(
formula,
data,
family = gaussian(),
prior = NULL,
autocor = NULL,
data2 = NULL,
cov_ranef = NULL,
sample_prior = "no",
sparse = NULL,
knots = NULL,
drop_unused_levels = TRUE,
stanvars = NULL,
stan_funs = NULL,
fit = NA,
save_pars = getOption("brms.save_pars", NULL),
save_ranef = NULL,
save_mevars = NULL,
save_all_pars = NULL,
init = NULL,
inits = NULL,
chains = 4,
iter = 2000,
warmup = floor(iter/2),
thin = 1,
cores = getOption("mc.cores", 1),
threads = getOption("brms.threads", NULL),
opencl = getOption("brms.opencl", NULL),
normalize = getOption("brms.normalize", TRUE),
control = NULL,
algorithm = getOption("brms.algorithm", "sampling"),
backend = getOption("brms.backend", "rstan"),
future = getOption("future", FALSE),
silent = 1,
seed = NA,
save_model = NULL,
stan_model_args = list(),
file = NULL,
file_compress = TRUE,
file_refit = getOption("brms.file_refit", "never"),
empty = FALSE,
rename = TRUE,
...
)
Arguments
formula 
An object of class 
data 
An object of class 
family 
A description of the response distribution and link function to
be used in the model. This can be a family function, a call to a family
function or a character string naming the family. Every family function has
a 
prior 
One or more 
autocor 
(Deprecated) An optional 
data2 
A named 
cov_ranef 
(Deprecated) A list of matrices that are proportional to the
(within) covariance structure of the grouplevel effects. The names of the
matrices should correspond to columns in 
sample_prior 
Indicate if draws from priors should be drawn
additionally to the posterior draws. Options are 
sparse 
(Deprecated) Logical; indicates whether the populationlevel
design matrices should be treated as sparse (defaults to 
knots 
Optional list containing user specified knot values to be used
for basis construction of smoothing terms. See

drop_unused_levels 
Should unused factors levels in the data be
dropped? Defaults to 
stanvars 
An optional 
stan_funs 
(Deprecated) An optional character string containing
selfdefined Stan functions, which will be included in the functions
block of the generated Stan code. It is now recommended to use the

fit 
An instance of S3 class 
save_pars 
An object generated by 
save_ranef 
(Deprecated) A flag to indicate if grouplevel effects for
each level of the grouping factor(s) should be saved (default is

save_mevars 
(Deprecated) A flag to indicate if draws of latent
noisefree variables obtained by using 
save_all_pars 
(Deprecated) A flag to indicate if draws from all
variables defined in Stan's 
init 
Initial values for the sampler. If 
inits 
(Deprecated) Alias of 
chains 
Number of Markov chains (defaults to 4). 
iter 
Number of total iterations per chain (including warmup; defaults to 2000). 
warmup 
A positive integer specifying number of warmup (aka burnin)
iterations. This also specifies the number of iterations used for stepsize
adaptation, so warmup draws should not be used for inference. The number
of warmup should not be larger than 
thin 
Thinning rate. Must be a positive integer. Set 
cores 
Number of cores to use when executing the chains in parallel,
which defaults to 1 but we recommend setting the 
threads 
Number of threads to use in withinchain parallelization. For
more control over the threading process, 
opencl 
The platform and device IDs of the OpenCL device to use for
fitting using GPU support. If you don't know the IDs of your OpenCL device,

normalize 
Logical. Indicates whether normalization constants should
be included in the Stan code (defaults to 
control 
A named 
algorithm 
Character string naming the estimation approach to use.
Options are 
backend 
Character string naming the package to use as the backend for
fitting the Stan model. Options are 
future 
Logical; If 
silent 
Verbosity level between 
seed 
The seed for random number generation to make results
reproducible. If 
save_model 
Either 
stan_model_args 
A 
file 
Either 
file_compress 
Logical or a character string, specifying one of the
compression algorithms supported by 
file_refit 
Modifies when the fit stored via the 
empty 
Logical. If 
rename 
For internal use only. 
... 
Further arguments passed to Stan.
For 
Details
Fit a generalized (non)linear multivariate multilevel model via
full Bayesian inference using Stan. A general overview is provided in the
vignettes vignette("brms_overview")
and
vignette("brms_multilevel")
. For a full list of available vignettes
see vignette(package = "brms")
.
Formula syntax of brms models
Details of the formula syntax applied in brms can be found in
brmsformula
.
Families and link functions
Details of families supported by brms can be found in
brmsfamily
.
Prior distributions
Priors should be specified using the
set_prior
function. Its documentation
contains detailed information on how to correctly specify priors. To find
out on which parameters or parameter classes priors can be defined, use
default_prior
. Default priors are chosen to be
non or very weakly informative so that their influence on the results will
be negligible and you usually don't have to worry about them. However,
after getting more familiar with Bayesian statistics, I recommend you to
start thinking about reasonable informative priors for your model
parameters: Nearly always, there is at least some prior information
available that can be used to improve your inference.
Adjusting the sampling behavior of Stan
In addition to choosing the number of iterations, warmup draws, and
chains, users can control the behavior of the NUTS sampler, by using the
control
argument. The most important reason to use control
is
to decrease (or eliminate at best) the number of divergent transitions that
cause a bias in the obtained posterior draws. Whenever you see the
warning "There were x divergent transitions after warmup." you should
really think about increasing adapt_delta
. To do this, write
control = list(adapt_delta = <x>)
, where <x>
should usually
be value between 0.8
(current default) and 1
. Increasing
adapt_delta
will slow down the sampler but will decrease the number
of divergent transitions threatening the validity of your posterior
draws.
Another problem arises when the depth of the tree being evaluated in each
iteration is exceeded. This is less common than having divergent
transitions, but may also bias the posterior draws. When it happens,
Stan will throw out a warning suggesting to increase
max_treedepth
, which can be accomplished by writing control =
list(max_treedepth = <x>)
with a positive integer <x>
that should
usually be larger than the current default of 10
. For more details
on the control
argument see stan
.
Value
An object of class brmsfit
, which contains the posterior
draws along with many other useful information about the model. Use
methods(class = "brmsfit")
for an overview on available methods.
Author(s)
PaulChristian Buerkner paul.buerkner@gmail.com
References
PaulChristian Buerkner (2017). brms: An R Package for Bayesian Multilevel
Models Using Stan. Journal of Statistical Software, 80(1), 128.
doi:10.18637/jss.v080.i01
PaulChristian Buerkner (2018). Advanced Bayesian Multilevel Modeling
with the R Package brms. The R Journal. 10(1), 395–411.
doi:10.32614/RJ2018017
See Also
brms
, brmsformula
,
brmsfamily
, brmsfit
Examples
## Not run:
# Poisson regression for the number of seizures in epileptic patients
fit1 < brm(
count ~ zBase * Trt + (1patient),
data = epilepsy, family = poisson(),
prior = prior(normal(0, 10), class = b) +
prior(cauchy(0, 2), class = sd)
)
# generate a summary of the results
summary(fit1)
# plot the MCMC chains as well as the posterior distributions
plot(fit1)
# predict responses based on the fitted model
head(predict(fit1))
# plot conditional effects for each predictor
plot(conditional_effects(fit1), ask = FALSE)
# investigate model fit
loo(fit1)
pp_check(fit1)
# Ordinal regression modeling patient's rating of inhaler instructions
# category specific effects are estimated for variable 'treat'
fit2 < brm(rating ~ period + carry + cs(treat),
data = inhaler, family = sratio("logit"),
prior = set_prior("normal(0,5)"), chains = 2)
summary(fit2)
plot(fit2, ask = FALSE)
WAIC(fit2)
# Survival regression modeling the time between the first
# and second recurrence of an infection in kidney patients.
fit3 < brm(time  cens(censored) ~ age * sex + disease + (1patient),
data = kidney, family = lognormal())
summary(fit3)
plot(fit3, ask = FALSE)
plot(conditional_effects(fit3), ask = FALSE)
# Probit regression using the binomial family
ntrials < sample(1:10, 100, TRUE)
success < rbinom(100, size = ntrials, prob = 0.4)
x < rnorm(100)
data4 < data.frame(ntrials, success, x)
fit4 < brm(success  trials(ntrials) ~ x, data = data4,
family = binomial("probit"))
summary(fit4)
# Nonlinear Gaussian model
fit5 < brm(
bf(cum ~ ult * (1  exp((dev/theta)^omega)),
ult ~ 1 + (1AY), omega ~ 1, theta ~ 1,
nl = TRUE),
data = loss, family = gaussian(),
prior = c(
prior(normal(5000, 1000), nlpar = "ult"),
prior(normal(1, 2), nlpar = "omega"),
prior(normal(45, 10), nlpar = "theta")
),
control = list(adapt_delta = 0.9)
)
summary(fit5)
conditional_effects(fit5)
# Normal model with heterogeneous variances
data_het < data.frame(
y = c(rnorm(50), rnorm(50, 1, 2)),
x = factor(rep(c("a", "b"), each = 50))
)
fit6 < brm(bf(y ~ x, sigma ~ 0 + x), data = data_het)
summary(fit6)
plot(fit6)
conditional_effects(fit6)
# extract estimated residual SDs of both groups
sigmas < exp(as.data.frame(fit6, variable = "^b_sigma_", regex = TRUE))
ggplot(stack(sigmas), aes(values)) +
geom_density(aes(fill = ind))
# Quantile regression predicting the 25%quantile
fit7 < brm(bf(y ~ x, quantile = 0.25), data = data_het,
family = asym_laplace())
summary(fit7)
conditional_effects(fit7)
# use the future package for more flexible parallelization
library(future)
plan(multisession, workers = 4)
fit7 < update(fit7, future = TRUE)
# fit a model manually via rstan
scode < stancode(count ~ Trt, data = epilepsy)
sdata < standata(count ~ Trt, data = epilepsy)
stanfit < rstan::stan(model_code = scode, data = sdata)
# feed the Stan model back into brms
fit8 < brm(count ~ Trt, data = epilepsy, empty = TRUE)
fit8$fit < stanfit
fit8 < rename_pars(fit8)
summary(fit8)
## End(Not run)