ordbetareg {ordbetareg}        R Documentation

Fit Ordered Beta Regression Model

Description

This function allows you to estimate an ordered beta regression model via a formula syntax.

The ordbetareg package is essentially a wrapper around brms that enables the ordered beta regression model to be fit. This model has advantages over other alternatives for continuous data with upper and lower bounds, such as survey sliders, indexes, dose-response relationships, and visual analog scales (among others). The package allows all of the many brms regression modeling functions to be used with the ordered beta regression distribution.

Usage

ordbetareg(
  formula = NULL,
  data = NULL,
  true_bounds = NULL,
  phi_reg = "none",
  use_brm_multiple = FALSE,
  coef_prior_mean = 0,
  coef_prior_SD = 5,
  intercept_prior_mean = NULL,
  intercept_prior_SD = NULL,
  phi_prior = 0.1,
  dirichlet_prior = c(1, 1, 1),
  phi_coef_prior_mean = 0,
  phi_coef_prior_SD = 5,
  phi_intercept_prior_mean = NULL,
  phi_intercept_prior_SD = NULL,
  extra_prior = NULL,
  init = "0",
  make_stancode = FALSE,
  ...
)

Arguments

formula

Either an R formula in the form response/DV ~ var1 + var2 etc. or a formula object created by the brms::bf function. Please avoid using 0 or Intercept in the formula definition.

data

An R data frame or tibble containing the variables in the formula

true_bounds

If the true bounds of the outcome/response don't exist in the data, pass a length 2 numeric vector of the minimum and maximum bounds to properly normalize the outcome/response
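
To see why true_bounds matters, the sketch below applies plain min-max rescaling in base R. It assumes that this approximates the package's internal normalization; rescale01 is a hypothetical helper written for this illustration, not a package function.

```r
# Hypothetical helper: min-max rescaling to [0, 1].
# Assumption: this only approximates what the package does internally.
rescale01 <- function(x, bounds = range(x, na.rm = TRUE)) {
  (x - bounds[1]) / (bounds[2] - bounds[1])
}

# A 0-100 slider where nobody chose a value below 10 or above 90
y <- c(10, 25, 50, 90)

observed <- rescale01(y)                      # bounds taken from the data
declared <- rescale01(y, bounds = c(0, 100))  # true bounds supplied

observed  # 10 and 90 are treated as if they were the scale limits
declared  # 10 and 90 stay strictly inside the interval
```

When the bounds are taken from the data, the smallest and largest observed values become 0 and 1 and would be treated as boundary responses, which is the situation true_bounds is meant to prevent.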

phi_reg

Whether you are including a linear model predicting the dispersion parameter, phi, and/or the response. If you are including models for both, pass 'both'. If you only have an intercept for the outcome (i.e. a 1 in place of covariates), pass 'only'. If you only have intercepts for phi (such as a varying intercepts/random effects model), pass 'intercepts'. To set priors on these intercepts, use the extra_prior option with the brms::set_prior function (class="sd"). If there is no model of any kind for phi, pass 'none' (the default).
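
A minimal sketch of a formula that models both the response and phi, assuming brms is installed. The column names therm, education, and income are the ones built in the Examples section; the model call itself is left commented out because sampling is slow.

```r
# Guarded so the block does nothing when brms is not installed.
if (requireNamespace("brms", quietly = TRUE)) {

  # One formula for the response, one for the dispersion parameter phi
  f_both <- brms::bf(therm ~ education, phi ~ income)

  # Not run: with models for both parts, phi_reg would be "both".
  # fit <- ordbetareg(formula = f_both, data = model_data,
  #                   phi_reg = "both", chains = 2, cores = 2)
}
```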

use_brm_multiple

(T/F) Whether the model should use brms::brm_multiple for multiple imputation over multiple dataframes passed as a list to the data argument

coef_prior_mean

The mean of the Normal distribution prior on the regression coefficients (for predicting the mean of the response). Default is 0.

coef_prior_SD

The SD of the Normal distribution prior on the regression coefficients (for predicting the mean of the response). Default is 5, which makes the prior weakly informative on the logit scale.

intercept_prior_mean

The mean of the Normal distribution prior for the intercept. The default is NULL, which means the intercept receives the same prior mean as coef_prior_mean. To zero out the intercept, set this parameter to 0 and intercept_prior_SD to a very small number (0.01 or smaller). NOTE: the intercept in brms is centered (mean-subtracted) by default. To use a traditional intercept, specify center=FALSE in the brms::bf formula function; adding 0 + Intercept to the formula is not supported by ordbetareg. See brms::brmsformula() for more info.

intercept_prior_SD

The SD of the Normal distribution prior for the intercept. The default is NULL, which means the intercept receives the same prior SD as coef_prior_SD.

phi_prior

The rate parameter of the exponential prior on phi, which determines the dispersion of the beta distribution. The default is 0.1, which implies a prior mean of 10 and is thus weakly informative on the interval (0.4, 30). If the response has very low variance (i.e. clusters tightly around a specific value), then decreasing this value (and so increasing the prior mean) may be helpful. Checking the estimate of phi in the model output will reveal whether a prior mean of 10 (rate of 0.1) is too small.
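
The relationship between the value passed to phi_prior and the implied prior mean can be checked in base R. This sketch assumes the value is used as the rate of an exponential distribution, consistent with 0.1 corresponding to a mean of 10.

```r
rate <- 0.1                 # the default phi_prior

# The mean of an exponential distribution is 1 / rate
prior_mean <- 1 / rate
prior_mean

# Central 90% prior interval for phi under this rate
prior_interval <- qexp(c(0.05, 0.95), rate = rate)
prior_interval

# A smaller rate moves prior mass toward larger values of phi
smaller_rate_mean <- 1 / 0.01
smaller_rate_mean
```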

dirichlet_prior

A vector of three integers corresponding to the prior parameters of the Dirichlet distribution (alpha parameter) governing the location of the cutpoints between the components of the response (continuous vs. degenerate). The default is c(1, 1, 1), which puts equal probability on degenerate versus continuous responses. It likely only needs to be changed in a repeated sampling situation to stabilize the cutpoint locations across samples.

phi_coef_prior_mean

The mean of the Normal distribution prior on the regression coefficients for predicting phi, the dispersion parameter. Only useful if a linear model is being fit to phi. Default is 0.

phi_coef_prior_SD

The SD of the Normal distribution prior on the regression coefficients for predicting phi, the dispersion parameter. Only useful if a linear model is being fit to phi. Default is 5, which makes the prior weakly informative on the exponential scale.

phi_intercept_prior_mean

The mean of the Normal distribution prior for the phi (dispersion) regression intercept. The default is NULL, which means the intercept receives the same prior mean as phi_coef_prior_mean. To zero out the intercept, set this parameter to 0 and phi_intercept_prior_SD to a very small number (0.01 or smaller).

phi_intercept_prior_SD

The SD of the Normal distribution prior for the phi (dispersion) regression intercept. The default is NULL, which means the intercept receives the same prior SD as phi_coef_prior_SD.

extra_prior

An additional prior, such as a prior for a specific regression coefficient, added to the outcome regression by passing one of the brms functions brms::set_prior or brms::prior_string with appropriate values.
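
A minimal sketch of building a prior for a single coefficient, assuming brms is installed. The covariate age, the outcome name, and the data frame my_data are hypothetical and stand in for your own variables.

```r
# Guarded so the block does nothing when brms is not installed.
if (requireNamespace("brms", quietly = TRUE)) {

  # Tighter Normal(0, 1) prior on the coefficient of the
  # hypothetical covariate `age`
  age_prior <- brms::set_prior("normal(0, 1)", class = "b", coef = "age")

  # Not run: `outcome` and `my_data` are placeholders.
  # fit <- ordbetareg(formula = outcome ~ age + income,
  #                   data = my_data,
  #                   extra_prior = age_prior)
}
```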

init

This parameter is used to determine starting values for the Stan sampler to begin Markov Chain Monte Carlo sampling. It is set by default at 0 because the non-linear nature of beta regression means that it is possible to begin with extreme values depending on the scale of the covariates. Setting this to 0 helps the sampler find starting values. It does, on the other hand, limit the ability to detect convergence issues with Rhat statistics. If that is a concern, such as with an experimental feature of brms, set this to "random" to get more robust starting values (just be sure to scale the covariates so they are not too large in absolute size).

make_stancode

If TRUE, will pass back the Stan code for the model as a character vector rather than fitting the model.

...

All other arguments passed on to the brm function

Details

This function is a wrapper around the brms::brm function, which is a powerful Bayesian regression modeling engine using Stan. To fully explore the options available, including dynamic and hierarchical modeling, please see the documentation for the brm function above. As the ordered beta regression model is currently not available in brms natively, this modeling function allows a brms model to be fit with the ordered beta regression distribution.

For more information about the model, see the paper here: https://osf.io/preprints/socarxiv/2sx6y/.

This function allows you to set priors on the dispersion parameter, the cutpoints, and the regression coefficients (see the Arguments section for options). To add a prior on an individual covariate, call brms::set_prior for that covariate (see its documentation) and pass the result to the extra_prior argument.

This function will also automatically normalize the outcome so that it lies in the [0,1] interval, as required by beta regression. For further information, see the documentation for the normalize function.

Priors can be set on a variety of coefficients in the model. See the description of the parameters coef_prior_mean and intercept_prior_mean, in addition to setting a custom prior with the extra_prior option. When setting priors on intercepts, it is important to note that by default, all intercepts in brms are centered (the means are subtracted from the data). As a result, a prior set on the default intercept will have a different interpretation than a prior on a traditional intercept (i.e. the value of the outcome when the covariates are all zero). To change this setting, use the brms::bf() function as a wrapper around the formula with the option center=FALSE to set priors on a traditional non-centered intercept.

Note that while brms also supports adding 0 + Intercept to the formula to address this issue, ordbetareg does not support this syntax. Instead, use center=FALSE as an option to brms::bf().
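
A minimal sketch of requesting a non-centered intercept, assuming brms is installed. The column names come from the Examples section, and the intercept prior values in the commented call are illustrative only.

```r
# Guarded so the block does nothing when brms is not installed.
if (requireNamespace("brms", quietly = TRUE)) {

  # center = FALSE turns off the default centering of the intercept
  f_uncentered <- brms::bf(therm ~ education + income, center = FALSE)

  # Not run: the prior values below are illustrative, not recommendations.
  # fit <- ordbetareg(formula = f_uncentered, data = model_data,
  #                   intercept_prior_mean = 0,
  #                   intercept_prior_SD = 1)
}
```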

To learn more about how the package works, see the vignette by using the command browseVignettes(package='ordbetareg').

To cite the package, please cite the following paper:

Kubinec, Robert. "Ordered Beta Regression: A Parsimonious, Well-Fitting Model for Continuous Data with Lower and Upper Bounds." Political Analysis. 2022.

Value

A brms object fitted with the ordered beta regression distribution.

Examples

# load survey data that comes with the package

library(ordbetareg)
library(dplyr)
data("pew")

# prepare data

model_data <- select(pew,therm,
             education="F_EDUCCAT2_FINAL",
             region="F_CREGION_FINAL",
             income="F_INCOME_FINAL")

# It takes a while to fit the models. Run the code
# below if you want to load a saved fitted model from the
# package, otherwise use the model-fitting code

data("ord_fit_mean")

  
# fit the actual model

if (.Platform$OS.type != "windows") {

  ord_fit_mean <- ordbetareg(formula = therm ~ education + income +
                               (1 | region),
                             data = model_data,
                             cores = 2, chains = 2)

}

# access values of the coefficients

summary(ord_fit_mean)

[Package ordbetareg version 0.7.2 Index]