brm {brms}  R Documentation 
Fit Bayesian generalized (non)linear multivariate multilevel models using Stan for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit – among others – linear, robust linear, count data, survival, response times, ordinal, zeroinflated, hurdle, and even selfdefined mixture models all in a multilevel context. Further modeling options include nonlinear and smooth terms, autocorrelation structures, censored data, metaanalytic standard errors, and quite a few more. In addition, all parameters of the response distributions can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their beliefs. In addition, model fit can easily be assessed and compared with posterior predictive checks and leaveoneout crossvalidation.
brm( formula, data, family = gaussian(), prior = NULL, autocor = NULL, data2 = NULL, cov_ranef = NULL, sample_prior = "no", sparse = NULL, knots = NULL, stanvars = NULL, stan_funs = NULL, fit = NA, save_pars = NULL, save_ranef = NULL, save_mevars = NULL, save_all_pars = NULL, inits = "random", chains = 4, iter = 2000, warmup = floor(iter/2), thin = 1, cores = getOption("mc.cores", 1), threads = NULL, normalize = getOption("brms.normalize", TRUE), control = NULL, algorithm = getOption("brms.algorithm", "sampling"), backend = getOption("brms.backend", "rstan"), future = getOption("future", FALSE), silent = 1, seed = NA, save_model = NULL, stan_model_args = list(), file = NULL, file_refit = "never", empty = FALSE, rename = TRUE, ... )
formula 
An object of class 
data 
An object of class 
family 
A description of the response distribution and link function to
be used in the model. This can be a family function, a call to a family
function or a character string naming the family. Every family function has
a 
prior 
One or more 
autocor 
(Deprecated) An optional 
data2 
A named 
cov_ranef 
(Deprecated) A list of matrices that are proportional to the
(within) covariance structure of the grouplevel effects. The names of the
matrices should correspond to columns in 
sample_prior 
Indicate if samples from priors should be drawn
additionally to the posterior samples. Options are 
sparse 
(Deprecated) Logical; indicates whether the populationlevel
design matrices should be treated as sparse (defaults to 
knots 
Optional list containing user specified knot values to be used
for basis construction of smoothing terms. See

stanvars 
An optional 
stan_funs 
(Deprecated) An optional character string containing
selfdefined Stan functions, which will be included in the functions
block of the generated Stan code. It is now recommended to use the

fit 
An instance of S3 class 
save_pars 
An object generated by 
save_ranef 
(Deprecated) A flag to indicate if grouplevel effects for
each level of the grouping factor(s) should be saved (default is

save_mevars 
(Deprecated) A flag to indicate if samples of latent
noisefree variables obtained by using 
save_all_pars 
(Deprecated) A flag to indicate if samples from all
variables defined in Stan's 
inits 
Either 
chains 
Number of Markov chains (defaults to 4). 
iter 
Number of total iterations per chain (including warmup; defaults to 2000). 
warmup 
A positive integer specifying number of warmup (aka burnin)
iterations. This also specifies the number of iterations used for stepsize
adaptation, so warmup samples should not be used for inference. The number
of warmup should not be larger than 
thin 
Thinning rate. Must be a positive integer. Set 
cores 
Number of cores to use when executing the chains in parallel,
which defaults to 1 but we recommend setting the 
threads 
Number of threads to use in withinchain parallelization. For
more control over the threading process, 
normalize 
Logical. Indicates whether normalization constants should
be included in the Stan code (defaults to 
control 
A named 
algorithm 
Character string naming the estimation approach to use.
Options are 
backend 
Character string naming the package to use as the backend for
fitting the Stan model. Options are 
future 
Logical; If 
silent 
Verbosity level between 
seed 
The seed for random number generation to make results
reproducible. If 
save_model 
Either 
stan_model_args 
A 
file 
Either 
file_refit 
Modifies when the fit stored via the 
empty 
Logical. If 
rename 
For internal use only. 
... 
Further arguments passed to Stan.
For 
Fit a generalized (non)linear multivariate multilevel model via
full Bayesian inference using Stan. A general overview is provided in the
vignettes vignette("brms_overview")
and
vignette("brms_multilevel")
. For a full list of available vignettes
see vignette(package = "brms")
.
Formula syntax of brms models
Details of the formula syntax applied in brms can be found in
brmsformula
.
Families and link functions
Details of families supported by brms can be found in
brmsfamily
.
Prior distributions
Priors should be specified using the
set_prior
function. Its documentation
contains detailed information on how to correctly specify priors. To find
out on which parameters or parameter classes priors can be defined, use
get_prior
. Default priors are chosen to be
non or very weakly informative so that their influence on the results will
be negligible and you usually don't have to worry about them. However,
after getting more familiar with Bayesian statistics, I recommend you to
start thinking about reasonable informative priors for your model
parameters: Nearly always, there is at least some prior information
available that can be used to improve your inference.
Adjusting the sampling behavior of Stan
In addition to choosing the number of iterations, warmup samples, and
chains, users can control the behavior of the NUTS sampler, by using the
control
argument. The most important reason to use control
is
to decrease (or eliminate at best) the number of divergent transitions that
cause a bias in the obtained posterior samples. Whenever you see the
warning "There were x divergent transitions after warmup." you should
really think about increasing adapt_delta
. To do this, write
control = list(adapt_delta = <x>)
, where <x>
should usually
be value between 0.8
(current default) and 1
. Increasing
adapt_delta
will slow down the sampler but will decrease the number
of divergent transitions threatening the validity of your posterior
samples.
Another problem arises when the depth of the tree being evaluated in each
iteration is exceeded. This is less common than having divergent
transitions, but may also bias the posterior samples. When it happens,
Stan will throw out a warning suggesting to increase
max_treedepth
, which can be accomplished by writing control =
list(max_treedepth = <x>)
with a positive integer <x>
that should
usually be larger than the current default of 10
. For more details
on the control
argument see stan
.
An object of class brmsfit
, which contains the posterior
samples along with many other useful information about the model. Use
methods(class = "brmsfit")
for an overview on available methods.
PaulChristian Buerkner paul.buerkner@gmail.com
PaulChristian Buerkner (2017). brms: An R Package for Bayesian Multilevel
Models Using Stan. Journal of Statistical Software, 80(1), 128.
doi:10.18637/jss.v080.i01
PaulChristian Buerkner (2018). Advanced Bayesian Multilevel Modeling
with the R Package brms. The R Journal. 10(1), 395–411.
doi:10.32614/RJ2018017
brms
, brmsformula
,
brmsfamily
, brmsfit
## Not run: # Poisson regression for the number of seizures in epileptic patients # using normal priors for populationlevel effects # and halfcauchy priors for standard deviations of grouplevel effects prior1 < prior(normal(0,10), class = b) + prior(cauchy(0,2), class = sd) fit1 < brm(count ~ zAge + zBase * Trt + (1patient), data = epilepsy, family = poisson(), prior = prior1) # generate a summary of the results summary(fit1) # plot the MCMC chains as well as the posterior distributions plot(fit1, ask = FALSE) # predict responses based on the fitted model head(predict(fit1)) # plot conditional effects for each predictor plot(conditional_effects(fit1), ask = FALSE) # investigate model fit loo(fit1) pp_check(fit1) # Ordinal regression modeling patient's rating of inhaler instructions # category specific effects are estimated for variable 'treat' fit2 < brm(rating ~ period + carry + cs(treat), data = inhaler, family = sratio("logit"), prior = set_prior("normal(0,5)"), chains = 2) summary(fit2) plot(fit2, ask = FALSE) WAIC(fit2) # Survival regression modeling the time between the first # and second recurrence of an infection in kidney patients. fit3 < brm(time  cens(censored) ~ age * sex + disease + (1patient), data = kidney, family = lognormal()) summary(fit3) plot(fit3, ask = FALSE) plot(conditional_effects(fit3), ask = FALSE) # Probit regression using the binomial family ntrials < sample(1:10, 100, TRUE) success < rbinom(100, size = ntrials, prob = 0.4) x < rnorm(100) data4 < data.frame(ntrials, success, x) fit4 < brm(success  trials(ntrials) ~ x, data = data4, family = binomial("probit")) summary(fit4) # Nonlinear Gaussian model fit5 < brm( bf(cum ~ ult * (1  exp((dev/theta)^omega)), ult ~ 1 + (1AY), omega ~ 1, theta ~ 1, nl = TRUE), data = loss, family = gaussian(), prior = c( prior(normal(5000, 1000), nlpar = "ult"), prior(normal(1, 2), nlpar = "omega"), prior(normal(45, 10), nlpar = "theta") ), control = list(adapt_delta = 0.9) ) summary(fit5) conditional_effects(fit5) # Normal model with heterogeneous variances data_het < data.frame( y = c(rnorm(50), rnorm(50, 1, 2)), x = factor(rep(c("a", "b"), each = 50)) ) fit6 < brm(bf(y ~ x, sigma ~ 0 + x), data = data_het) summary(fit6) plot(fit6) conditional_effects(fit6) # extract estimated residual SDs of both groups sigmas < exp(posterior_samples(fit6, "^b_sigma_")) ggplot(stack(sigmas), aes(values)) + geom_density(aes(fill = ind)) # Quantile regression predicting the 25%quantile fit7 < brm(bf(y ~ x, quantile = 0.25), data = data_het, family = asym_laplace()) summary(fit7) conditional_effects(fit7) # use the future package for more flexible parallelization library(future) plan(multiprocess) fit7 < update(fit7, future = TRUE) # fit a model manually via rstan scode < make_stancode(count ~ Trt, data = epilepsy) sdata < make_standata(count ~ Trt, data = epilepsy) stanfit < rstan::stan(model_code = scode, data = sdata) # feed the Stan model back into brms fit8 < brm(count ~ Trt, data = epilepsy, empty = TRUE) fit8$fit < stanfit fit8 < rename_pars(fit8) summary(fit8) ## End(Not run)