discrete_cm {GLMcat}    R Documentation

Discrete Choice Models

Description

Fits the family of discrete choice models. These models require data in long form: for each individual (or decision maker) there are multiple rows, one for each alternative the individual could have chosen. The group of rows belonging to the same individual is a "case". Note that each case represents a single statistical observation, even though it spans multiple rows.
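
As a minimal sketch of the long form described above, the base R snippet below builds a toy data set with two cases and three alternatives each. The column names (indv, mode, choice, cost) and all values are hypothetical and chosen only for illustration; they are not taken from any data set shipped with the package.

```r
# Toy long-format data: 2 cases (individuals) x 3 alternatives = 6 rows.
# All names and values here are hypothetical.
toy <- data.frame(
  indv   = rep(1:2, each = 3),
  mode   = rep(c("air", "bus", "car"), times = 2),
  choice = factor(c("no", "no", "yes", "yes", "no", "no")),
  cost   = c(70, 25, 10, 65, 30, 15)
)

# Each case contributes one row per alternative ...
rows_per_case <- table(toy$indv)

# ... but counts as a single statistical observation.
n_cases <- length(unique(toy$indv))

# In this toy example exactly one alternative is chosen within each case.
chosen_per_case <- tapply(toy$choice == "yes", toy$indv, sum)
```

Here the data frame has six rows but only two cases, which is the number of statistical observations in the sense used above.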

Usage

discrete_cm(
  formula,
  case_id,
  alternatives,
  reference,
  alternative_specific = NA,
  data,
  cdf = list(),
  intercept = "standard",
  normalization = 1,
  control = list(),
  na.action = "na.omit",
  find_nu = FALSE
)

Arguments

formula

a symbolic description of the model to be fit. An expression of the form y ~ predictors specifies that the response y is modeled by a linear predictor given symbolically by predictors. A particularity of the formula is that, for case-specific variables, the user can define an effect specific to one category (in the parameter 'alternative_specific').

case_id

a string with the name of the column that identifies each case.

alternatives

a string with the name of the column that identifies the vector of alternatives the individual could have chosen.

reference

a string indicating the reference category.

alternative_specific

a character vector with the names of the explanatory variables that take a different value for each alternative within a case; these are the alternative-specific variables. By default, the explanatory variables that appear in the formula but are not listed here are treated as case-specific.
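
Alternative-specific variables change from one alternative to the next within a case, whereas case-specific variables stay constant within it. The following base R sketch shows one way to check this before filling in 'alternative_specific'. The helper varies_within_case() and the toy columns (indv, mode, income, cost) are hypothetical and not part of GLMcat.

```r
# Hypothetical long-format data: 'income' is constant within a case,
# 'cost' changes from one alternative to the next.
toy <- data.frame(
  indv   = rep(1:2, each = 3),
  mode   = rep(c("air", "bus", "car"), times = 2),
  income = rep(c(35, 50), each = 3),
  cost   = c(70, 25, 10, 65, 30, 15)
)

# TRUE if the variable takes more than one value inside at least one case.
varies_within_case <- function(x, case) {
  any(tapply(x, case, function(v) length(unique(v)) > 1))
}

vars <- c("income", "cost")
is_alt_specific <- sapply(vars, function(v) varies_within_case(toy[[v]], toy$indv))

# Candidates for the 'alternative_specific' argument.
alt_specific_vars <- vars[is_alt_specific]
```

In this toy data only cost would be passed in 'alternative_specific'; income would be left as a case-specific variable.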

data

a data frame in long format, with the dependent variable as a factor.

cdf

a parameter specifying the inverse distribution function to be used as part of the link function. If the distribution has no parameters, enter its name as a string. The default value is 'logistic'. If the distribution has parameters, enter a list. For example, for Student's t distribution it would be 'list("student", df=2)'; for the noncentral Student's t distribution it would be 'list("noncentralt", df=2, mu=1)'.
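
To get a feel for the distributions named above, the sketch below evaluates the corresponding cumulative distribution functions with base R only; it does not call GLMcat. Treating 'mu' as the non-centrality parameter (R's 'ncp') is an assumption made here for illustration.

```r
q <- c(-1, 0, 1)

logistic_cdf <- plogis(q)        # logistic distribution
student_cdf  <- pt(q, df = 2)    # Student's t with 2 degrees of freedom

# Assumption: 'mu' corresponds to R's non-centrality parameter 'ncp'.
noncentral_cdf <- pt(q, df = 2, ncp = 1)

# qlogis() is the quantile function (inverse CDF) matching plogis().
round_trip <- qlogis(logistic_cdf)
```

The logistic and Student's t distributions are symmetric around zero, so both give 0.5 at q = 0, while the noncentral version under this assumption does not.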

intercept

if set to "conditional", the design will be equivalent to the conditional logit model.

normalization

the quantile to use for the normalization of the estimated coefficients where the logistic distribution is used as the base cumulative distribution function.

control

a list specifying additional control parameters:
 - 'maxit': the maximum number of iterations for the Fisher scoring algorithm.
 - 'epsilon': a double value to fix the epsilon value.
 - 'beta_init': an appropriately sized vector for the initial iteration of the algorithm.

na.action

an argument specifying how missing data are handled. Available options are na.omit, na.fail, and na.exclude. These functions come from the stats package; the na.pass option is not supported.
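
The three options are functions from the stats package, and their general behavior can be sketched on a small data frame. The data frame d below is hypothetical, and the snippet shows only what these functions do in base R, not how discrete_cm applies them internally.

```r
# Hypothetical data frame with one missing value.
d <- data.frame(y = c(1, NA, 3), x = c(10, 20, 30))

# na.omit() drops every row that contains a missing value.
kept <- na.omit(d)

# na.exclude() keeps the same rows, but records the dropped ones so that
# model functions supporting it can pad residuals and predictions with NA.
excl <- na.exclude(d)

# na.fail() stops with an error as soon as any value is missing.
failed <- inherits(try(na.fail(d), silent = TRUE), "try-error")
```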

find_nu

a logical argument indicating whether, when the Student CDF is used, an optimization algorithm should search for the optimal degrees of freedom for the model.

Details

Family of models for Discrete Choice

Note

The intercept cannot be excluded from these models.

Examples

library(GLMcat)
data(TravelChoice)

discrete_cm(formula = choice ~ hinc + gc + invt,
            case_id = "indv", alternatives = "mode", reference = "air",
            data = TravelChoice,
            cdf = "logistic")

# Model with alternative-specific effects for gc and invt:
discrete_cm(formula = choice ~ hinc + gc + invt,
            case_id = "indv", alternatives = "mode", reference = "air",
            data = TravelChoice, alternative_specific = c("gc", "invt"),
            cdf = "logistic")

# A more specific design was studied by Louviere et al. (2000, p. 157) and Greene (2003, p. 730).
# These analyses set the effects of the variables hinc and psize exclusively for the category air:
discrete_cm(formula = choice ~ hinc[air] + psize[air] + gc + ttme,
            case_id = "indv",
            alternatives = "mode",
            reference = "car",
            alternative_specific = c("gc", "ttme"),
            data = TravelChoice)

[Package GLMcat version 0.2.6 Index]