
ms_slgf {slgf}

R Documentation

Bayesian Model Selection with Latent Group-Based Regression Effects and Heteroscedasticity

Description

`ms_slgf` implements the Bayesian model selection method proposed by Metzger and Franck (2019).

Usage

ms_slgf(
  dataf,
  response,
  lgf_beta,
  min_levels_beta = 1,
  lgf_Sigma,
  min_levels_Sigma = 1,
  same_scheme = TRUE,
  usermodels,
  het = rep(0, length(usermodels)),
  prior = "flat",
  m0 = NULL
)

Arguments

`dataf`	A data frame containing a continuous response, at least one categorical predictor, and any other covariates of interest. This data frame should not contain column names with the character string `group`.
`response`	A character string indicating the column of `dataf` that contains the response.
`lgf_beta`	An optional character string indicating the column of `dataf` that contains the suspected latent grouping factor (SLGF) for the regression effects.
`min_levels_beta`	A numeric value indicating the minimum number of levels of `lgf_beta` that can comprise a group. Defaults to 1.
`lgf_Sigma`	An optional character string indicating the column of `dataf` that contains the suspected latent grouping factor (SLGF) for the residual variances.
`min_levels_Sigma`	A numeric value indicating the minimum number of levels of `lgf_Sigma` that can comprise a group. Defaults to 1.
`same_scheme`	A logical value indicating whether the grouping schemes for `lgf_beta` and `lgf_Sigma` must be the same. Defaults to `TRUE`.
`usermodels`	A list of length `M` where each element contains a string of R class `formula` or `character` indicating the models to consider. The term `group` should be used to replace the name of the SLGF in models with group-based regression effects.
`het`	A vector of 0s and 1s of length `M`. If the mth element of `het` is 0, then the mth model of `usermodels` is considered in a homoscedastic context only; if the mth element of `het` is 1, the mth model of `usermodels` is considered in both homoscedastic and heteroscedastic contexts.
`prior`	A character string `"flat"` or `"zs"` indicating whether to implement the flat or Zellner-Siow mixture g-prior on regression effects, respectively. Defaults to `"flat"`.
`m0`	An integer value indicating the minimum training sample size. Defaults to NULL. If no value is provided, the lowest value that leads to convergence for all considered posterior model probabilities will be used. If the value provided is too low for convergence, it will be increased automatically.
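
The `min_levels_beta` and `min_levels_Sigma` arguments limit how small a group may be, which in turn limits how many grouping schemes are considered. The base R sketch below counts the candidate schemes under the assumption that each scheme splits the levels of the SLGF into exactly two non-empty groups; that assumption comes from the cited paper and is not stated on this page. The helper `count_schemes` is hypothetical and is not part of slgf.

```r
# Hypothetical helper (not part of slgf): count two-group schemes for a factor
# with K levels when each group must contain at least min_levels levels.
count_schemes <- function(K, min_levels = 1) {
  sizes <- seq_len(K - 1)  # possible sizes of the first group
  ok <- sizes >= min_levels & (K - sizes) >= min_levels
  # Each unordered split is counted twice (once per group), so halve the total.
  sum(choose(K, sizes[ok])) / 2
}

count_schemes(5, min_levels = 1)  # 15
count_schemes(5, min_levels = 2)  # 10
```

Raising the minimum group size shrinks the model space, which may be useful when single-level groups are not scientifically plausible.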

Value

ms_slgf returns a list of five elements if the flat prior is used, and six elements if the Zellner-Siow mixture g-prior is used:
1) models, an M by 7 matrix where columns contain the model selection results and information for each model, including:
- Model, the formula associated with each model;
- Scheme.beta, the grouping scheme associated with the fixed effects;
- Scheme.Sigma, the grouping scheme associated with the variances;
- Log-Marginal, the fractional log-marginal likelihood associated with each model;
- FmodProb, the fractional posterior probability associated with each model;
- ModPrior, the prior assigned to each model;
- Cumulative, the cumulative fractional posterior probability associated with a given model and the previous models;
2) class_probabilities, a vector containing cumulative posterior probabilities associated with each model class;
3) coefficients, MLEs for each model's regression effects;
4) variances, MLEs based on concentrated likelihood for each model's variance(s);
5) gs, MLEs based on concentrated likelihood for each model's g; only included if prior="zs".
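
To make the relationship between the `Log-Marginal`, `ModPrior`, `FmodProb`, and `Cumulative` columns concrete, the sketch below normalizes a set of log-marginal likelihoods into posterior model probabilities in the usual Bayesian way. It assumes the package follows this standard normalization; the internals of `ms_slgf` may differ. The numbers are made up for illustration and are not output from `ms_slgf`.

```r
# Illustrative values only; these are not output from ms_slgf.
log_marginal <- c(-120.4, -118.9, -125.0)
mod_prior    <- c(0.5, 0.25, 0.25)

# Posterior probability is proportional to prior * marginal likelihood.
# Subtracting the maximum before exponentiating avoids numerical underflow.
log_post <- log_marginal + log(mod_prior)
post <- exp(log_post - max(log_post))
post <- post / sum(post)

# Sort from most to least probable and accumulate the probabilities.
ord <- order(post, decreasing = TRUE)
tab <- data.frame(model = ord, prob = post[ord], cumulative = cumsum(post[ord]))
tab
```

Working on the log scale matters here because marginal likelihoods of this magnitude underflow to zero if exponentiated directly.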

Author(s)

Thomas A. Metzger and Christopher T. Franck

References

Metzger TA, Franck CT (2019). “Detection of latent heteroscedasticity and group-based regression effects in linear models via Bayesian model selection.” arXiv e-prints.

Examples

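# Analyze the smell and textile data sets.

library(slgf)
library(numDeriv)

# smell data set: agecat is the SLGF for both regression effects and variances
data(smell)
out_smell <- ms_slgf(dataf = smell, response = "olf", het = c(1, 1),
                     lgf_beta = "agecat", lgf_Sigma = "agecat",
                     same_scheme = TRUE, min_levels_beta = 1, min_levels_Sigma = 1,
                     usermodels = list("olf~agecat", "olf~group"), m0 = 4)
out_smell$models[1:5, ]        # first five rows of the model table
out_smell$coefficients[[46]]   # regression effect estimates for model 46
out_smell$variances[[46]]      # variance estimate(s) for model 46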

# textile data set: starch is the SLGF; schemes may differ (same_scheme = FALSE)
data(textile)
out_textile <- ms_slgf(dataf = textile, response = "strength",
                       lgf_beta = "starch", lgf_Sigma = "starch",
                       same_scheme = FALSE, min_levels_beta = 1, min_levels_Sigma = 1,
                       usermodels = list("strength~film+starch", "strength~film*starch",
                                         "strength~film+group", "strength~film*group"),
                       het = c(1, 1, 1, 1), prior = "flat", m0 = 8)
out_textile$models[1:5, c(1, 2, 3, 5)]   # first five rows, selected columns
out_textile$class_probabilities
out_textile$coefficients[31]
out_textile$variances[31]

[Package slgf version 2.0.0 Index]