baggr {baggr}  R Documentation 
Bayesian inference on parameters of an average treatment effects model that's appropriate to the supplied individual- or group-level data, using Hamiltonian Monte Carlo in Stan. (For the overall package help file see baggr-package.)
baggr(
  data,
  model = NULL,
  pooling = "partial",
  effect = NULL,
  covariates = c(),
  prior_hypermean = NULL,
  prior_hypersd = NULL,
  prior_hypercor = NULL,
  prior_beta = NULL,
  prior_control = NULL,
  prior_control_sd = NULL,
  prior = NULL,
  ppd = FALSE,
  pooling_control = "none",
  test_data = NULL,
  quantiles = seq(0.05, 0.95, 0.1),
  outcome = "outcome",
  group = "group",
  treatment = "treatment",
  silent = FALSE,
  warn = TRUE,
  ...
)
Arguments:

data: data frame with summary or individual-level data to meta-analyse

model: if NULL, the appropriate model is detected automatically from the input data; otherwise choose from "rubin", "mutau", "rubin_full", "quantiles", "sslab" and "logit" (see Models section below)

pooling: type of pooling; choose from "none", "partial" (the default) and "full"

effect: label for the effect; will default to "mean" in most cases, "log OR" in the logistic model, quantiles in the "quantiles" model

covariates: character vector with column names in data; the corresponding columns are used as covariates in (meta-)regression (see Covariates section below)

prior_hypermean: prior distribution for the hypermean; you can use "plain text" notation like prior_hypermean = normal(0, 100)

prior_hypersd: prior for the hyper-standard deviation, used by the Rubin and "mutau" models

prior_hypercor: prior for the hypercorrelation matrix, used by the "mutau" model

prior_beta: prior for regression coefficients, if covariates are specified

prior_control: prior for the mean in the control arm (baseline), currently used in the "logit" model

prior_control_sd: prior for the SD in the control arm (baseline), currently used in the "logit" model

prior: alternative way to specify all priors as a named list with elements hypermean, hypersd, hypercor, beta, analogous to the prior_ arguments above

ppd: logical; use the prior predictive distribution? (p.p.d.) If TRUE, the model samples from the priors rather than conditioning on the data

pooling_control: pooling for group-specific control mean terms (currently only in the "logit" model)

test_data: data for cross-validation; NULL for no validation, otherwise a data frame with the same columns as the data argument

quantiles: if model = "quantiles", a vector of quantiles of the outcome to use (numbers between 0 and 1)

outcome: character; column name in (individual-level) data with values of the outcome variable

group: character; column name in data with the grouping factor; required for individual-level data, while for summary data it is only used to label groups

treatment: character; column name in (individual-level) data with the treatment factor

silent: whether to silence messages about prior settings and about other automatic behaviour

warn: whether to print an additional warning if Rhat exceeds 1.05

...: extra options passed to the Stan function, e.g. control = list(adapt_delta = 0.99), number of iterations etc.
Details:

Running baggr requires 1/ data preparation, 2/ choice of model, 3/ choice of priors. All three are discussed in depth in the package vignette (vignette("baggr")).
Data. For aggregate data models you need a data frame with columns tau and se, or tau, mu, se.tau, se.mu. An additional column can be used to provide labels for each group (by default column group is used if available, but this can be customised – see the example below).
For individual-level data three columns are needed: outcome, treatment, group. These are identified by using the outcome, treatment and group arguments.
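As an illustration, an individual-level input with the three default column names might look like this (a minimal sketch with made-up values; only the column names and types matter):

```r
# Minimal individual-level input: one row per observation,
# with the default column names expected by baggr()
df_ind <- data.frame(
  outcome   = rnorm(100),                           # measured outcome
  treatment = rep(c(0, 1), 50),                     # 0 = control, 1 = treated
  group     = rep(c("Site A", "Site B"), each = 50) # group labels
)
```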
Many data preparation steps can be done through a helper function, prepare_ma. It can convert individual- to summary-level data, calculate odds/risk ratios (with/without corrections) in binary data, standardise variables and more. Using it will automatically format data inputs to work with baggr().
Models. Available models are:

- for continuous variable means: the "rubin" model for average treatment effect (using summary data), the "mutau" version which takes into account means of control groups (also using summary data), and "rubin_full", which is the same model as "rubin" but works with individual-level data
- for continuous variable quantiles: the "quantiles" model (see Meager, 2019 in references)
- for mixture data: "sslab" (experimental)
- for binary data: the "logit" model can be used on individual-level data; you can also analyse continuous statistics such as log odds ratios and log risk ratios using the models listed above; see vignette("baggr_binary") for a tutorial with examples
If no model is specified, the function tries to infer the appropriate model automatically. Additionally, the user can specify the type of pooling; the default is always partial pooling.
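For example (a sketch, assuming the baggr package is loaded and df_pooled is a summary-level data frame with tau and se columns, as in the Examples section), the model can be requested explicitly instead of relying on automatic detection:

```r
library(baggr)

# With summary-level data the "rubin" model would be inferred automatically,
# but it can also be requested explicitly, together with the pooling type
fit <- baggr(df_pooled, model = "rubin", pooling = "partial")
```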
Covariates. Both aggregate and individual-level data can include extra columns, given by the covariates argument (specified as a character vector of column names), to be used in regression models. We also refer to the impact of these covariates as fixed effects.

Two types of covariates may be present in your data:

- In "rubin" and "mutau" models, covariates that change according to group unit. In that case, the model accounting for the group covariates is a meta-regression model. It can be modelled on summary-level data.
- In "logit" and "rubin_full" models, covariates that change according to individual unit. Then, such a model is commonly referred to as a mixed model. It has to be fitted to individual-level data. Note that meta-regression is a special case of a mixed model for individual-level data.
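A sketch of both cases (assuming baggr is loaded; 'region' is a hypothetical column in the summary data df_pooled, and 'age' a hypothetical column in the individual-level data df_ind – neither is part of the package's built-in datasets):

```r
# Meta-regression: a group-level covariate on summary data ("rubin" model)
fit_mr <- baggr(df_pooled, covariates = c("region"))

# Mixed model: an individual-level covariate with the "logit" model
fit_mx <- baggr(df_ind, model = "logit", covariates = c("age"))

# Coefficients for the covariates in either model:
fixed_effects(fit_mr)
```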
Priors. It is optional to specify priors yourself, as the package will try to propose an appropriate prior for the input data if you do not pass a prior argument. To set the priors yourself, use the prior_ arguments. For specifying many priors at once (or re-using them between models), a single prior = list(...) argument can be used instead. The meaning of the prior parameters may slightly change from model to model. Details and examples are given in vignette("baggr").
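For instance, the following two calls are intended to be equivalent (a sketch; normal() is one of baggr's prior constructors, and I assume the list element names mirror the prior_ arguments, as described for the prior argument above):

```r
# Priors set one by one...
fit1 <- baggr(df_pooled,
              prior_hypermean = normal(0, 10),
              prior_hypersd   = normal(0, 5))

# ...or all at once through a named list
fit2 <- baggr(df_pooled,
              prior = list(hypermean = normal(0, 10),
                           hypersd   = normal(0, 5)))
```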
Setting ppd = TRUE can be used to obtain prior predictive distributions, which is useful for understanding the prior assumptions, especially in conjunction with effect_plot. You can also compare models under different priors by setting baggr_compare(..., compare = "prior").
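A sketch of a prior predictive check under the same assumptions as above:

```r
# Sample from the prior predictive distribution (data are not conditioned on)
fit_ppd <- baggr(df_pooled, ppd = TRUE)
effect_plot(fit_ppd)  # what treatment effects do the priors imply?
```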
Cross-validation. When test_data are specified, an extra parameter, the log predictive density, will be returned by the model. (The fitted model itself is the same regardless of whether there are test_data.) To understand this parameter, see the documentation of loocv, a function that can be used to assess out-of-sample prediction of the model using all available data. If using an individual-level data model, test_data should only include treatment arms of the groups of interest. (This is because in cross-validation we are typically not interested in the model's ability to fit heterogeneity in control arms, but only in treatment arms.) For aggregate-level data there is no such restriction.
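For summary-level data, a single hold-out split might look like this (a sketch; df_pooled as above):

```r
# Hold out the first group's row and fit on the remaining groups
train <- df_pooled[-1, ]
test  <- df_pooled[1, , drop = FALSE]

fit_cv <- baggr(train, test_data = test)
fit_cv  # the output now also reports the log predictive density measure
```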
Outputs. By default, some outputs are printed. There is also a plot method for baggr objects which you can access via baggr_plot (or simply plot()). Other standard functions for working with a baggr object are:

- treatment_effect for the distribution of hyperparameters
- group_effects for distributions of group-specific parameters
- fixed_effects for coefficients in (meta-)regression
- effect_draw and effect_plot for posterior predictive distributions
- baggr_compare for comparing multiple baggr models
- loocv for cross-validation
- pp_check for posterior predictive checks
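A typical post-fit workflow using these functions (a sketch; fit is any fitted baggr model):

```r
treatment_effect(fit)  # posterior of hyperparameters (e.g. the hypermean)
group_effects(fit)     # group-specific effect posteriors
effect_plot(fit)       # posterior predictive distribution of the effect
plot(fit)              # shorthand for baggr_plot(fit)
```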
Value:

baggr class structure: a list including the Stan model fit alongside input data, pooling metrics, and various model properties. If test data is used, the mean value of -2*lpd is reported as mean_lpd.
Author(s): Witold Wiecek, Rachael Meager
Examples:

df_pooled <- data.frame("tau" = c(1, -1, .5, -.5, .7, -.7, 1.3, -1.3),
                        "se" = rep(1, 8),
                        "state" = datasets::state.name[1:8])

baggr(df_pooled) # baggr automatically detects the input data

# same model, but with correct labels,
# different pooling & passing some options to Stan
baggr(df_pooled, group = "state", pooling = "full", iter = 500)

# model with non-default (and very informative) priors
baggr(df_pooled, prior_hypersd = normal(0, 2))

# "mu & tau" model, using a built-in dataset
# prepare_ma() can summarise individual-level data
ms <- microcredit_simplified
microcredit_summary_data <- prepare_ma(ms, outcome = "consumption")
baggr(microcredit_summary_data, model = "mutau",
      iter = 500, # this is just for illustration - don't set it this low normally!
      pooling = "partial",
      prior_hypercor = lkj(1),
      prior_hypersd = normal(0, 10),
      prior_hypermean = multinormal(c(0, 0), matrix(c(10, 3, 3, 10), 2, 2)))