R: Custom Families in 'brms' Models

custom_family {brms}

R Documentation

Custom Families in brms Models

Description

Define custom families (i.e. response distribution) for use in brms models. It allows users to benefit from the modeling flexibility of brms, while applying their self-defined likelihood functions. All of the post-processing methods for brmsfit objects can be made compatible with custom families. See vignette("brms_customfamilies") for more details. For a list of built-in families see brmsfamily.

Usage

custom_family(
  name,
  dpars = "mu",
  links = "identity",
  type = c("real", "int"),
  lb = NA,
  ub = NA,
  vars = NULL,
  loop = TRUE,
  specials = NULL,
  threshold = "flexible",
  log_lik = NULL,
  posterior_predict = NULL,
  posterior_epred = NULL,
  predict = NULL,
  fitted = NULL,
  env = parent.frame()
)

Arguments

`name`	Name of the custom family.
`dpars`	Names of the distributional parameters of the family. One parameter must be named `"mu"` and the main formula of the model will correspond to that parameter.
`links`	Names of the link functions of the distributional parameters.
`type`	Indicates if the response distribution is continuous (`"real"`) or discrete (`"int"`). This controls if the corresponding density function will be named with `<name>_lpdf` or `<name>_lpmf`.
`lb`	Vector of lower bounds of the distributional parameters. Defaults to `NA` that is no lower bound.
`ub`	Vector of upper bounds of the distributional parameters. Defaults to `NA` that is no upper bound.
`vars`	Names of variables that are part of the likelihood function without being distributional parameters. That is, `vars` can be used to pass data to the likelihood. Such arguments will be added to the list of function arguments at the end, after the distributional parameters. See `stanvar` for details about adding self-defined data to the generated Stan model. Addition arguments `vreal` and `vint` may be used for this purpose as well (see Examples below). See also `brmsformula` and `addition-terms` for more details.
`loop`	Logical; Should the likelihood be evaluated via a loop (`TRUE`; the default) over observations in Stan? If `FALSE`, the Stan code will be written in a vectorized manner over observations if possible.
`specials`	A character vector of special options to enable for this custom family. Currently for internal use only.
`threshold`	Optional threshold type for custom ordinal families. Ignored for non-ordinal families.
`log_lik`	Optional function to compute log-likelihood values of the model in R. This is only relevant if one wants to ensure compatibility with method `log_lik`.
`posterior_predict`	Optional function to compute posterior prediction of the model in R. This is only relevant if one wants to ensure compatibility with method `posterior_predict`.
`posterior_epred`	Optional function to compute expected values of the posterior predictive distribution of the model in R. This is only relevant if one wants to ensure compatibility with method `posterior_epred`.
`predict`	Deprecated alias of 'posterior_predict'.
`fitted`	Deprecated alias of 'posterior_epred'.
`env`	An `environment` in which certain post-processing functions related to the custom family can be found, if there were not directly passed to `custom_family`. This is only relevant if one wants to ensure compatibility with the methods `log_lik`, `posterior_predict`, or `posterior_epred`. By default, `env` is the environment from which `custom_family` is called.

Details

The corresponding probability density or mass Stan functions need to have the same name as the custom family. That is if a family is called myfamily, then the Stan functions should be called myfamily_lpdf or myfamily_lpmf depending on whether it defines a continuous or discrete distribution.

Value

An object of class customfamily inheriting from class brmsfamily.

Examples

## Not run: 
## demonstrate how to fit a beta-binomial model
## generate some fake data
phi <- 0.7
n <- 300
z <- rnorm(n, sd = 0.2)
ntrials <- sample(1:10, n, replace = TRUE)
eta <- 1 + z
mu <- exp(eta) / (1 + exp(eta))
a <- mu * phi
b <- (1 - mu) * phi
p <- rbeta(n, a, b)
y <- rbinom(n, ntrials, p)
dat <- data.frame(y, z, ntrials)

# define a custom family
beta_binomial2 <- custom_family(
  "beta_binomial2", dpars = c("mu", "phi"),
  links = c("logit", "log"), lb = c(NA, 0),
  type = "int", vars = "vint1[n]"
)

# define the corresponding Stan density function
stan_density <- "
  real beta_binomial2_lpmf(int y, real mu, real phi, int N) {
    return beta_binomial_lpmf(y | N, mu * phi, (1 - mu) * phi);
  }
"
stanvars <- stanvar(scode = stan_density, block = "functions")

# fit the model
fit <- brm(y | vint(ntrials) ~ z, data = dat,
           family = beta_binomial2, stanvars = stanvars)
summary(fit)


# define a *vectorized* custom family (no loop over observations)
# notice also that 'vint' no longer has an observation index
beta_binomial2_vec <- custom_family(
  "beta_binomial2", dpars = c("mu", "phi"),
  links = c("logit", "log"), lb = c(NA, 0),
  type = "int", vars = "vint1", loop = FALSE
)

# define the corresponding Stan density function
stan_density_vec <- "
  real beta_binomial2_lpmf(array[] int y, vector mu, real phi, array[] int N) {
    return beta_binomial_lpmf(y | N, mu * phi, (1 - mu) * phi);
  }
"
stanvars_vec <- stanvar(scode = stan_density_vec, block = "functions")

# fit the model
fit_vec <- brm(y | vint(ntrials) ~ z, data = dat,
           family = beta_binomial2_vec,
           stanvars = stanvars_vec)
summary(fit_vec)

## End(Not run)

[Package brms version 2.21.0 Index]