ordbetareg {ordbetareg} | R Documentation |
Fit Ordered Beta Regression Model
Description
This function allows you to estimate an ordered beta regression model via a formula syntax.
The ordbetareg package is essentially a wrapper around brms that enables the ordered beta regression model to be fit. This model has advantages over other alternatives for continuous data with upper and lower bounds, such as survey sliders, indexes, dose-response relationships, and visual analog scales (among others). The package allows all of the many brms regression modeling functions to be used with the ordered beta regression distribution.
Usage
ordbetareg(
formula = NULL,
data = NULL,
true_bounds = NULL,
phi_reg = "none",
use_brm_multiple = FALSE,
coef_prior_mean = 0,
coef_prior_SD = 5,
intercept_prior_mean = NULL,
intercept_prior_SD = NULL,
phi_prior = 0.1,
dirichlet_prior = c(1, 1, 1),
phi_coef_prior_mean = 0,
phi_coef_prior_SD = 5,
phi_intercept_prior_mean = NULL,
phi_intercept_prior_SD = NULL,
extra_prior = NULL,
init = "0",
make_stancode = FALSE,
...
)
Arguments
formula |
Either an R formula in the form response/DV ~ var1 + var2
etc. or formula object as created/called by the |
data |
An R data frame or tibble containing the variables in the formula |
true_bounds |
If the true bounds of the outcome/response don't exist in the data, pass a length 2 numeric vector of the minimum and maximum bounds to properly normalize the outcome/response |
phi_reg |
Whether you are including a linear model predicting
the dispersion parameter, phi, and/or for the response. If you are
including models for both, pass option 'both'. If you only have an
intercept for the outcome (i.e. a 1 in place of covariates), pass 'only'.
If you only have intercepts for phi (such as a varying intercepts/random effects
model), pass the value "intercepts". To set priors on these intercepts,
use the |
use_brm_multiple |
(T/F) Whether the model should use
brms::brm_multiple for multiple
imputation over multiple dataframes passed
as a list to the |
coef_prior_mean |
The mean of the Normal distribution prior on the regression coefficients (for predicting the mean of the response). Default is 0. |
coef_prior_SD |
The SD of the Normal distribution prior on the regression coefficients (for predicting the mean of the response). Default is 5, which makes the prior weakly informative on the logit scale. |
intercept_prior_mean |
The mean of the Normal distribution prior
for the intercept. The default is NULL, which means the intercept
receives the same prior as |
intercept_prior_SD |
The SD of the Normal distribution prior
for the intercept. The default is NULL, which means the intercept
receives the same prior SD as |
phi_prior |
The rate parameter of the exponential prior on phi, which determines the dispersion of the beta distribution. The default is 0.1, which equals a prior mean of 10 and is thus weakly informative on the interval (0.4, 30). If the response has very low variance (i.e. clusters tightly around a specific value), then decreasing this value (and thereby increasing the expected value of phi) may be helpful. Checking the value of phi in the output of the model command will reveal whether the default of 0.1 (prior mean of 10) implies too small a value of phi. |
dirichlet_prior |
A vector of three integers corresponding to the prior parameters for the Dirichlet distribution (alpha parameter) governing the location of the cutpoints between the components of the response (continuous vs. degenerate). The default is c(1, 1, 1), which puts equal probability on degenerate versus continuous responses. Likely only needs to be changed in a repeated sampling situation to stabilize the cutpoint locations across samples. |
phi_coef_prior_mean |
The mean of the Normal distribution prior on the regression coefficients for predicting phi, the dispersion parameter. Only useful if a linear model is being fit to phi. Default is 0. |
phi_coef_prior_SD |
The SD of the Normal distribution prior on the regression coefficients for predicting phi, the dispersion parameter. Only useful if a linear model is being fit to phi. Default is 5, which makes the prior weakly informative on the exponential scale. |
phi_intercept_prior_mean |
The mean of the Normal distribution prior
for the phi (dispersion) regression intercept. The default is NULL,
which means the intercept
receives the same prior as |
phi_intercept_prior_SD |
The SD of the Normal distribution prior
for the phi (dispersion) regression intercept. The default is NULL,
which means the intercept
receives the same prior SD as |
extra_prior |
An additional prior, such as a prior for a specific
regression coefficient, added to the outcome regression by passing one of the |
init |
This parameter is used to determine starting values for
the Stan sampler to begin Markov Chain Monte Carlo sampling. It is
set by default at 0 because the non-linear nature of beta regression
means that it is possible to begin with extreme values depending on the
scale of the covariates. Setting this to 0 helps the sampler find
starting values. It does, on the other hand, limit the ability to detect
convergence issues with Rhat statistics. If that is a concern, such as
with an experimental feature of |
make_stancode |
If |
... |
All other arguments passed on to the |
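The default priors described in the arguments above can be inspected with base R before any model is fit. The following is a minimal sketch, assuming that the phi_prior value acts as the rate of an exponential distribution (so that 0.1 corresponds to a prior mean of 10) and that coefficient priors are Normal on the logit scale. All variable names are illustrative and not part of the package.

```r
# Assumption: phi_prior is the rate of an exponential prior on phi,
# so the prior mean is 1 / rate.
phi_rate <- 0.1
phi_prior_mean <- 1 / phi_rate

# Quantiles showing where most of the prior mass for phi falls
phi_quantiles <- qexp(c(0.05, 0.5, 0.95), rate = phi_rate)

# A smaller rate implies a larger expected phi (a less dispersed response)
phi_prior_mean_small_rate <- 1 / 0.02

# Normal(0, 5) prior on a coefficient: central 95% interval on the logit scale
coef_interval_logit <- qnorm(c(0.025, 0.975), mean = 0, sd = 5)

# Linear predictor values of this size map to proportions close to 0 and 1,
# which is why the prior is only weakly informative
coef_interval_prop <- plogis(coef_interval_logit)

print(phi_prior_mean)
print(round(phi_quantiles, 2))
print(signif(coef_interval_prop, 3))
```

If the fitted phi is far above the range covered by these quantiles, that would suggest trying a smaller phi_prior value.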
Details
This function is a wrapper around the brms::brm function, which is a powerful Bayesian regression modeling engine using Stan. To fully explore the options available, including dynamic and hierarchical modeling, please see the documentation for the brm function above. As the ordered beta regression model is currently not available in brms natively, this modeling function allows a brms model to be fit with the ordered beta regression distribution.
For more information about the model, see the paper here: https://osf.io/preprints/socarxiv/2sx6y/.
This function allows you to set priors on the dispersion parameter, the cutpoints, and the regression coefficients (see below for options). However, to add specific priors on individual covariates, you would need to use the brms::set_prior function by specifying an individual covariate (see function documentation) and passing the result of the function call to the extra_prior argument.
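As a rough illustration of passing a covariate-specific prior, the sketch below builds a prior with brms::set_prior and hands it to extra_prior. The coefficient name "educationCollegegraduate" is hypothetical; the real names depend on the factor levels in your data and can be listed with brms::get_prior. The model call is guarded by a flag because sampling takes a while, and it assumes model_data has been prepared as in the Examples section.

```r
library(brms)

# Hypothetical coefficient name: replace it with a name that exists
# in your own model (see brms::get_prior for the available names).
educ_prior <- set_prior("normal(0, 1)", class = "b",
                        coef = "educationCollegegraduate")

# Not run by default because sampling takes a while.
# `model_data` is assumed to be prepared as in the Examples section.
run_fit <- FALSE
if (run_fit) {
  library(ordbetareg)
  fit_extra_prior <- ordbetareg(formula = therm ~ education + income +
                                  (1 | region),
                                data = model_data,
                                extra_prior = educ_prior,
                                cores = 2, chains = 2)
}
```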
This function will also automatically normalize the outcome so that it lies in the [0,1] interval, as required by beta regression. For further information, see the documentation for the normalize function.
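The kind of rescaling involved can be sketched in base R. The helper below, rescale_outcome, is hypothetical and is not the package's normalize function; it assumes a plain min-max transformation and shows why supplying true bounds matters when the observed data do not reach the ends of the scale.

```r
# Hypothetical helper, not the package's normalize() function.
# Assumes a min-max rescaling to the [0, 1] interval.
rescale_outcome <- function(x, true_bounds = NULL) {
  if (is.null(true_bounds)) {
    # Fall back to the observed minimum and maximum
    true_bounds <- range(x, na.rm = TRUE)
  }
  stopifnot(length(true_bounds) == 2, true_bounds[1] < true_bounds[2])
  (x - true_bounds[1]) / (true_bounds[2] - true_bounds[1])
}

# A 0-100 thermometer where no respondent chose 0 or 100
therm_obs <- c(20, 50, 80)

# Without true bounds, the observed extremes are treated as 0 and 1
scaled_observed <- rescale_outcome(therm_obs)

# With true bounds, the values keep their position on the real scale
scaled_true <- rescale_outcome(therm_obs, true_bounds = c(0, 100))

print(scaled_observed)
print(scaled_true)
```

In the first call the values 20 and 80 would be treated as responses at the bounds, which is the situation the true_bounds argument is meant to avoid.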
Priors can be set on a variety of coefficients in the model; see the description of the parameters coef_prior_mean and intercept_prior_mean, in addition to setting a custom prior with the extra_prior option.
When setting priors on intercepts, it is important to note that by default, all intercepts in brms are centered (the means are subtracted from the data). As a result, a prior set on the default intercept will have a different interpretation than a traditional intercept (i.e. the value of the outcome when the covariates are all zero). To change this setting, use the brms::bf() function as a wrapper around the formula with the option center=FALSE to set priors on a traditional non-centered intercept.
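The effect of centering on the meaning of an intercept can be seen with an ordinary least squares analogy in base R. This is only an illustration of the general idea with simulated data, not a description of how brms implements centering internally, and all names below are made up for the example.

```r
set.seed(1)

# Simulated data with a covariate that is far from zero on average
x <- rnorm(100, mean = 5)
y <- 2 + 0.5 * x + rnorm(100, sd = 0.1)

# Traditional intercept: expected outcome when x equals zero
fit_raw <- lm(y ~ x)

# Centered intercept: expected outcome when x equals its mean
fit_centered <- lm(y ~ I(x - mean(x)))

intercept_raw <- unname(coef(fit_raw)[1])
intercept_centered <- unname(coef(fit_centered)[1])
slope_raw <- unname(coef(fit_raw)[2])
slope_centered <- unname(coef(fit_centered)[2])

print(c(raw = intercept_raw, centered = intercept_centered))
print(c(raw = slope_raw, centered = slope_centered))
```

The slopes agree, but the two intercepts answer different questions, so a prior that is sensible for one may be poorly placed for the other.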
Note that while brms also supports adding 0 + Intercept to the formula to address this issue, ordbetareg does not support this syntax. Instead, use center=FALSE as an option to brms::bf().
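A minimal sketch of the center=FALSE approach follows. The formula matches the one in the Examples section; the intercept prior values (mean 0, SD 2) are placeholders chosen for illustration rather than recommendations, and the model call is guarded by a flag because sampling takes a while.

```r
library(brms)

# center = FALSE requests a traditional (non-centered) intercept
uncentered_formula <- bf(therm ~ education + income + (1 | region),
                         center = FALSE)

# Not run by default because sampling takes a while.
# `model_data` is assumed to be prepared as in the Examples section,
# and the prior values below are illustrative placeholders.
run_fit <- FALSE
if (run_fit) {
  library(ordbetareg)
  fit_uncentered <- ordbetareg(formula = uncentered_formula,
                               data = model_data,
                               intercept_prior_mean = 0,
                               intercept_prior_SD = 2,
                               cores = 2, chains = 2)
}
```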
To learn more about how the package works, see the vignette by using the command browseVignettes(package='ordbetareg').
For more info about the distribution, see this paper: https://osf.io/preprints/socarxiv/2sx6y/
To cite the package, please cite the following paper:
Kubinec, Robert. "Ordered Beta Regression: A Parsimonious, Well-Fitting Model for Continuous Data with Lower and Upper Bounds." Political Analysis. 2022.
Value
A brms object fitted with the ordered beta regression distribution.
Examples
# load survey data that comes with the package
library(ordbetareg)
library(dplyr)
data("pew")

# prepare data: keep the outcome and rename the covariates
model_data <- select(pew, therm,
                     education = "F_EDUCCAT2_FINAL",
                     region = "F_CREGION_FINAL",
                     income = "F_INCOME_FINAL")

# It takes a while to fit the model. To skip fitting, load the
# saved fitted model that comes with the package
data("ord_fit_mean")

# fit the actual model (this replaces the saved fit loaded above)
if (.Platform$OS.type != "windows") {
  ord_fit_mean <- ordbetareg(formula = therm ~ education + income +
                               (1 | region),
                             data = model_data,
                             cores = 2, chains = 2)
}

# access values of the coefficients
summary(ord_fit_mean)