R: Binary Logit Models for Nested Dichotomies

nestedLogit {nestedLogit}

R Documentation

Binary Logit Models for Nested Dichotomies

Description

Fit a related set of binary logit models via the glm function to nested dichotomies, comprising a model for the polytomy. A polytomous response with m categories can be analyzed using m-1 binary logit comparisons. When these comparisons are nested, the m-1 sub-models are statistically independent. Therefore, the likelihood chi-square statistics for the sub-models are additive and give overall tests for a model for the polytomy. This method was introduced by Fienberg (1980),and subsequently illustrated by Fox(2016) and Friendly & Meyer (2016).

dichotomy and logits are helper functions to construct the dichotomies.

continuationLogits constructs a set of m-1 logit comparisons, called continuation logits, for an ordered response. With m=4 levels, say, A, B, C, D, considered low to high: The first contrasts B, C, D against A. The second ignores A and contrasts C, D against B. The second ignores A, B and contrasts D against C.

Usage

nestedLogit(formula, dichotomies, data, subset = NULL, contrasts = NULL, ...)

logits(...)

dichotomy(...)

continuationLogits(levels, names, prefix = "above_")

Arguments

`formula`	a model formula with the polytomous response on the left-hand side and the usual linear-model-like specification on the right-hand side.
`dichotomies`	specification of the logits for the nested dichotomies, constructed by the `logits` and `dichotomy` functions, or `continuationLogits`. Alternatively, the `dichotomies` can be specified as a nested (i.e., recursive) list, the elements of which can be given optional names. See Details.
`data`	a data frame with the data for the model; unlike in most statistical modeling functions, the `data` argument is required. Cases with `NA`s in any of the variables appearing in the model formula will be removed with a Note message.
`subset`	a character string specifying an expression to fit the model to a subset of the data; the default, `NULL`, uses the full data set.
`contrasts`	an optional list of contrast specification for specific factors in the model; see `lm` for details.
`...`	for `nestedLogit`, optional named arguments to be passed to `glm`; for `logits`, definitions of the nested logits—with each named argument specifying a dichotomy; for `dichotomy`, two character vectors giving the levels defining the dichotomy; the vectors can optionally be named.
`levels`	A character vector of set of levels of the variables or a number specifying the numbers of levels (in which case, uppercase letters will be use for the levels).
`names`	Names to be assigned to the dichotomies; if absent, names will be generated from the levels.
`prefix`	a character string (default: `"above_"`) used as a prefix to the names of the continuation dichotomies.

Details

A dichotomy for a categorical variable is a comparison of one subset of levels against another subset. A set of dichotomies is nested, if after an initial dichotomy, all subsequent ones are within the groups of levels lumped together in earlier ones. Nested dichotomies correspond to a binary tree of the successive divisions.

For example, for a 3-level response, a first dichotomy could be {A}, {B, C} and then the second one would be just {B}, {C}. Note that in the second dichotomy, observations with response A are treated as NA.

The function dichotomy constructs a single dichotomy in the required form, which is a list of length 2 containing two character vectors giving the levels defining the dichotomy. The function logits is used to create the set of dichotomies for a response factor. Alternatively, the nested dichotomies can be specified more compactly as a nested (i.e., recursive) list with optionally named elements; for example, list(air="plane", ground=list(public=list("train", "bus"), private="car")).

The function continuationLogits provides a convenient way to generate all dichotomies for an ordered response. For an ordered response with m=4 levels, say, A, B, C, D, considered low to high: The dichotomy first contrasts B, C, D against A. The second ignores A and contrasts C, D against B. The second ignores A, B and contrasts D against C.

Value

nestedLogit returns an object of class "nestedLogit" containing the following elements:

models, a named list of (normally) m - 1 "glm" objects, each a binary logit model for one of the m - 1 nested dichotomies representing the m-level response.
formula, the model formula for the nested logit models.
dichotomies, the "dichotomies" object defining the nested dichotomies for the model.
data.name, the name of the data set to which the model is fit, of class "name".
data, the data set to which the model is fit.
subset, a character representation of the subset argument or "NULL" if the argument isn't specified.
contrasts, the contrasts argument or NULL if the argument isn't specified.
contrasts.print a character representation of the contrasts argument or "NULL" if the argument isn't specified.

logits and continuationLogits return objects of class "dichotomies" and c("continuationDichotomies" "dichotomies"), respectively, which are two-elements lists, each element containing a list of two character vectors representing a dichotomy. dichotomy returns a list of two character vectors representing a dichotomy.

Author(s)

John Fox

References

S. Fienberg (1980). The Analysis of Cross-Classified Categorical Data, 2nd Edition, MIT Press, Section 6.6.

J. Fox (2016), Applied Linear Regression and Generalized Linear Models, 3rd Edition, Sage, Section 14.2.2.

J. Fox and S. Weisberg (2011), An R Companion to Applied Regression, 2nd Edition, Sage, Section 5.8.

M. Friendly and D. Meyers (2016), Discrete Data Analysis with R, CRC Press, Section 8.2.

Examples

data("Womenlf", package = "carData")

  #' Use `logits()` and `dichotomy()` to specify the comparisons of interest
  comparisons <- logits(work=dichotomy("not.work",
                                       working=c("parttime", "fulltime")),
                        full=dichotomy("parttime", "fulltime"))
  print(comparisons)

  m <- nestedLogit(partic ~ hincome + children,
                   dichotomies = comparisons,
                   data=Womenlf)
  print(summary(m))
  print(car::Anova(m))
  coef(m)

  # equivalent;
  nestedLogit(partic ~ hincome + children,
              dichotomies = list("not.work",
                                 working=list("parttime", "fulltime")),
              data=Womenlf)

  # get predicted values
  new <- expand.grid(hincome=seq(0, 45, length=10),
                     children=c("absent", "present"))
  pred.nested <- predict(m, new)

  # plot
  op <- par(mfcol=c(1, 2), mar=c(4, 4, 3, 1) + 0.1)
  plot(m, "hincome", list(children="absent"),
       xlab="Husband's Income", legend=FALSE)
  plot(m, "hincome", list(children="present"),
       xlab="Husband's Income")
  par(op)


  continuationLogits(c("none", "gradeschool", "highschool", "college"))
  continuationLogits(4)

[Package nestedLogit version 0.3.2 Index]