R: Fit Ordered Beta Regression Model

ordbetareg {ordbetareg}

R Documentation

Fit Ordered Beta Regression Model

Description

This function allows you to estimate an ordered beta regression model via a formula syntax.

The ordbetareg package is essentially a wrapper around brms that enables the ordered beta regression model to be fit. This model has advantages over other alternatives for continous data with upper and lower bounds, such as survey sliders, indexes, dose-response relationships, and visual analog scales (among others). The package allows for all of the many brms regression modeling functions to be used with the ordered beta regression distribution.

Usage

ordbetareg(
  formula = NULL,
  data = NULL,
  true_bounds = NULL,
  phi_reg = "none",
  use_brm_multiple = FALSE,
  coef_prior_mean = 0,
  coef_prior_SD = 5,
  intercept_prior_mean = NULL,
  intercept_prior_SD = NULL,
  phi_prior = 0.1,
  dirichlet_prior = c(1, 1, 1),
  phi_coef_prior_mean = 0,
  phi_coef_prior_SD = 5,
  phi_intercept_prior_mean = NULL,
  phi_intercept_prior_SD = NULL,
  extra_prior = NULL,
  init = "0",
  make_stancode = FALSE,
  ...
)

Arguments

`formula`	Either an R formula in the form response/DV ~ var1 + var2 etc. or formula object as created/called by the `brms` brms::bf function. *Please avoid using 0 or `Intercept` in the formula definition.
`data`	An R data frame or tibble containing the variables in the formula
`true_bounds`	If the true bounds of the outcome/response don't exist in the data, pass a length 2 numeric vector of the minimum and maximum bounds to properly normalize the outcome/response
`phi_reg`	Whether you are including a linear model predicting the dispersion parameter, phi, and/or for the response. If you are including models for both, pass option 'both'. If you only have an intercept for the outcome (i.e. a 1 in place of covariates), pass 'only'. If you only have intercepts for phi (such as a varying intercepts/random effects) model, pass the value "intercepts". To set priors on these intercepts, use the `extra-prior` option with the brms::set_prior function (class="sd"). If no model of any kind for phi, the default, pass 'none'.
`use_brm_multiple`	(T/F) Whether the model should use brms::brm_multiple for multiple imputation over multiple dataframes passed as a list to the `data` argument
`coef_prior_mean`	The mean of the Normal distribution prior on the regression coefficients (for predicting the mean of the response). Default is 0.
`coef_prior_SD`	The SD of the Normal distribution prior on the regression coefficients (for predicting the mean of the response). Default is 5, which makes the prior weakly informative on the logit scale.
`intercept_prior_mean`	The mean of the Normal distribution prior for the intercept. By default is NULL, which means the intercept receives the same prior as `coef_prior_mean`. To zero out the intercept, set this parameter to 0 and `coef_prior_SD` to a very small number (0.01 or smaller). NOTE: the default intercept in `brms` is centered (mean-subtracted) by default. To use a traditional intercept, either add `0 + Intercept` to the formula or specify `center=FALSE` in the `bf` formula function for `brms`. See `brms::brmsformula()` for more info.
`intercept_prior_SD`	The SD of the Normal distribution prior for the intercept. By default is NULL, which means the intercept receives the same prior SD as `coef_prior_SD`.
`phi_prior`	The mean parameter of the exponential prior on phi, which determines the dispersion of the beta distribution. The default is .1, which equals a mean of 10 and is thus weakly informative on the interval (0.4, 30). If the response has very low variance (i.e. tightly) clusters around a specific value, then decreasing this prior (and increasing the expected value) may be helpful. Checking the value of phi in the output of the model command will reveal if a value of 0.1 (mean of 10) is too small.
`dirichlet_prior`	A vector of three integers corresponding to the prior parameters for the dirchlet distribution (alpha parameter) governing the location of the cutpoints between the components of the response (continuous vs. degenerate). The default is 1 which puts equal probability on degenerate versus continuous responses. Likely only needs to be changed in a repeated sampling situation to stabilize the cutpoint locations across samples.
`phi_coef_prior_mean`	The mean of the Normal distribution prior on the regression coefficients for predicting phi, the dispersion parameter. Only useful if a linear model is being fit to phi. Default is 0.
`phi_coef_prior_SD`	The SD of the Normal distribution prior on the regression coefficients for predicting phi, the dispersion parameter. Only useful if a linear model is being fit to phi. Default is 5, which makes the prior weakly informative on the exponential scale.
`phi_intercept_prior_mean`	The mean of the Normal distribution prior for the phi (dispersion) regression intercept. By default is NULL, which means the intercept receives the same prior as `phi_coef_prior_mean`. To zero out the intercept, set this parameter to 0 and `phi_coef_prior_SD` to a very small number (0.01 or smaller).
`phi_intercept_prior_SD`	The SD of the Normal distribution prior for the phi (dispersion) regression intercept. By default is NULL, which means the intercept receives the same prior SD as `phi_coef_prior_SD`.
`extra_prior`	An additional prior, such as a prior for a specific regression coefficient, added to the outcome regression by passing one of the `brms` functions brms::set_prior or brms::prior_string with appropriate values.
`init`	This parameter is used to determine starting values for the Stan sampler to begin Markov Chain Monte Carlo sampling. It is set by default at 0 because the non-linear nature of beta regression means that it is possible to begin with extreme values depending on the scale of the covariates. Setting this to 0 helps the sampler find starting values. It does, on the other hand, limit the ability to detect convergence issues with Rhat statistics. If that is a concern, such as with an experimental feature of `brms`, set this to `"random"` to get more robust starting values (just be sure to scale the covariates so they are not too large in absolute size).
`make_stancode`	If `TRUE`, will pass back the Stan code for the model as a character vector rather than fitting the model.
`...`	All other arguments passed on to the `brm` function

Details

This function is a wrapper around the brms::brm function, which is a powerful Bayesian regression modeling engine using Stan. To fully explore the options available, including dynamic and hierarchical modeling, please see the documentation for the brm function above. As the ordered beta regression model is currently not available in brms natively, this modeling function allows a brms model to be fit with the ordered beta regression distribution.

For more information about the model, see the paper here: https://osf.io/preprints/socarxiv/2sx6y/.

This function allows you to set priors on the dispersion parameter, the cutpoints, and the regression coefficients (see below for options). However, to add specific priors on individual covariates, you would need to use the brms::set_prior function by specifying an individual covariate (see function documentation) and passing the result of the function call to the extra_prior argument.

This function will also automatically normalize the outcome so that it lies in the \[0,1\] interval, as required by beta regression. For furthur information, see the documentation for the normalize function.

Priors can be set on a variety of coefficients in the model, see the description of parameters coef_prior_mean and intercept_prior_mean, in addition to setting a custom prior with the extra_prior option. When setting priors on intercepts, it is important to note that by default, all intercepts in brms are centered (the means are subtracted from the data). As a result, a prior set on the default intercept will have a different interpretation than a traditional intercept (i.e. the value of the outcome when the covariates are all zero). To change this setting, use the brms::bf() function as a wrapper around the formula with the option center=FALSE to set priors on a traditional non-centered intercept.

Note that while brms also supports adding 0 + Intercept to the formula to address this issue, ordbetareg does not support this syntax. Instead, use center=FALSE as an option to brms::bf().

To learn more about how the package works, see the vignette by using the command browseVignettes(package='ordbetareg').

For more info about the distribution, see this paper: https://osf.io/preprints/socarxiv/2sx6y/

To cite the package, please cite the following paper:

Kubinec, Robert. "Ordered Beta Regression: A Parsimonious, Well-Fitting Model for Continuous Data with Lower and Upper Bounds." Political Analysis. 2022.

Value

A brms object fitted with the ordered beta regression distribution.

Examples

# load survey data that comes with the package

library(dplyr)
data("pew")

# prepare data

model_data <- select(pew,therm,
             education="F_EDUCCAT2_FINAL",
             region="F_CREGION_FINAL",
             income="F_INCOME_FINAL")

# It takes a while to fit the models. Run the code
# below if you want to load a saved fitted model from the
# package, otherwise use the model-fitting code

data("ord_fit_mean")

  
  # fit the actual model

  if(.Platform$OS.type!="windows") {

    ord_fit_mean <- ordbetareg(formula=therm ~ education + income +
      (1|region),
      data=model_data,
      cores=2,chains=2)

  }


  

# access values of the coefficients

summary(ord_fit_mean)

[Package ordbetareg version 0.7.2 Index]