mlogit {mlogit} | R Documentation |
Multinomial logit model
Description
Estimation by maximum likelihood of the multinomial logit model, with alternative-specific and/or individual-specific variables.
Usage
mlogit(
formula,
data,
subset,
weights,
na.action,
start = NULL,
alt.subset = NULL,
reflevel = NULL,
nests = NULL,
un.nest.el = FALSE,
unscaled = FALSE,
heterosc = FALSE,
rpar = NULL,
probit = FALSE,
R = 40,
correlation = FALSE,
halton = NULL,
random.nb = NULL,
panel = FALSE,
estimate = TRUE,
seed = 10,
...
)
Arguments
formula |
a symbolic description of the model to be estimated, |
data |
the data: an |
subset |
an optional vector specifying a subset of
observations for |
weights |
an optional vector of weights, |
na.action |
a function which indicates what should happen when
the data contains |
start |
a vector of starting values, |
alt.subset |
a vector of character strings containing the subset of alternatives on which the model should be estimated, |
reflevel |
the base alternative (the one for which the coefficients of individual-specific variables are normalized to 0), |
nests |
a named list of character vectors, each name being a nest, the corresponding vector being the set of alternatives that belong to this nest, |
un.nest.el |
a boolean, if |
unscaled |
a boolean, if |
heterosc |
a boolean, if |
rpar |
a named vector whose names are the random parameters
and values the distribution : |
probit |
if |
R |
the number of function evaluations for the Gaussian
quadrature method used if |
correlation |
only relevant if |
halton |
only relevant if |
random.nb |
only relevant if |
panel |
only relevant if |
estimate |
a boolean indicating whether the model should be
estimated or not: if not, the |
seed |
the seed to use for random numbers (for mixed logit and probit models), |
... |
further arguments passed to |
Details
For how to use the formula argument, see Formula().
The data argument may be an ordinary data.frame. In this case, some supplementary arguments should be provided and are passed to mlogit.data(). Note that it is not necessary to indicate the choice argument, as it is deduced from the formula.
The model is estimated using the mlogit.optim() function.
The basic multinomial logit model and three important extensions of this model may be estimated.
If heterosc = TRUE, the heteroscedastic logit model is estimated. J - 1 extra coefficients are estimated that represent the scale parameters for J - 1 alternatives, the scale parameter for the reference alternative being normalized to 1. The probabilities don't have a closed form; they are approximated using a Gaussian quadrature method.
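As a rough illustration of why numerical integration is needed, the sketch below computes heteroscedastic logit choice probabilities for a single choice situation in base R. It assumes Gumbel errors with alternative-specific scale parameters, so that the probability of each alternative is a one-dimensional integral over that alternative's error. The function name hev_prob, the utilities V and the scales theta are hypothetical, and base R's integrate() (adaptive quadrature) stands in for the Gaussian quadrature used by mlogit; this is not the package's implementation.

```r
# Choice probability of alternative i under a heteroscedastic logit,
# assuming errors eps_j = theta[j] * w_j with w_j standard Gumbel.
# V: systematic utilities, theta: scale parameters (both hypothetical).
hev_prob <- function(i, V, theta) {
  integrand <- function(w) {
    # standard Gumbel density of w, written so that it underflows to 0
    # instead of producing Inf * 0 for very negative w
    value <- exp(-w - exp(-w))
    for (j in seq_along(V)[-i]) {
      # P(eps_j < V[i] - V[j] + theta[i] * w): Gumbel CDF with scale theta[j]
      value <- value * exp(-exp(-(V[i] - V[j] + theta[i] * w) / theta[j]))
    }
    value
  }
  # base R adaptive quadrature, used here in place of Gaussian quadrature
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

V <- c(beach = 0.5, pier = 0, boat = -0.3)

# With all scales equal to 1 the integral reduces to the closed-form logit
p_homo <- sapply(seq_along(V), hev_prob, V = V, theta = c(1, 1, 1))
p_logit <- unname(exp(V) / sum(exp(V)))

# With unequal scales (the first one fixed to 1) there is no closed form
p_hetero <- sapply(seq_along(V), hev_prob, V = V, theta = c(1, 1.5, 0.7))

print(rbind(p_logit, p_homo, p_hetero))
```

Comparing p_homo with p_logit is a quick sanity check on the integrand: the two rows should agree up to the integration tolerance, while p_hetero differs but still sums to one.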
If nests is not NULL, the nested logit model is estimated.
If rpar is not NULL, the random parameter model is estimated. The probabilities are approximated using simulations with R draws, and Halton sequences are used if halton is not NULL. Pseudo-random numbers are drawn from a standard normal distribution and the relevant transformations are performed to obtain numbers drawn from a normal, log-normal, censored-normal or uniform distribution. If correlation = TRUE, the correlations between the random parameters are taken into account by estimating the components of the Cholesky decomposition of the covariance matrix. With G random parameters, G standard deviations are estimated without correlation, and G * (G + 1) / 2 coefficients are estimated with correlation.
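The simulation step can be sketched in base R as follows. Standard normal draws are transformed into draws from each supported distribution, and correlated normal parameters are obtained through the Cholesky factor of a covariance matrix. The parameter names, the values of mu, sigma, means and Sigma, and the exact parameterization of each distribution are assumptions made for illustration; they are not taken from the package's internals.

```r
set.seed(10)
R <- 50                      # number of draws
z <- rnorm(R)                # standard normal draws

mu <- 0.5; sigma <- 0.2      # hypothetical location and spread

draws_n  <- mu + sigma * z               # normal
draws_ln <- exp(mu + sigma * z)          # log-normal
draws_cn <- pmax(mu + sigma * z, 0)      # normal censored at zero
# uniform on (mu - sigma, mu + sigma), via the normal CDF of z
draws_u  <- mu - sigma + 2 * sigma * pnorm(z)

# Correlated normal parameters with G = 2 random parameters
G <- 2
means <- c(price = -1, catch = 0.5)          # hypothetical means
Sigma <- matrix(c(1, 0.3, 0.3, 0.5), G, G)   # hypothetical covariance matrix
L <- t(chol(Sigma))                          # lower-triangular, L %*% t(L) equals Sigma
Z <- matrix(rnorm(R * G), nrow = R)
beta <- sweep(Z %*% t(L), 2, means, "+")     # one row of parameters per draw

# Number of free elements of L: G * (G + 1) / 2, versus G standard
# deviations when the parameters are uncorrelated
n_chol <- sum(lower.tri(L, diag = TRUE))
```

Estimating the elements of L rather than Sigma directly keeps the implied covariance matrix positive semi-definite for any values of the coefficients, which is one common motivation for this parameterization.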
Value
An object of class "mlogit", a list with elements:
coefficients: the named vector of coefficients,
logLik: the value of the log-likelihood,
hessian: the hessian of the log-likelihood at convergence,
gradient: the gradient of the log-likelihood at convergence,
call: the matched call,
est.stat: some information about the estimation (time used, optimisation method),
freq: the frequency of choice,
residuals: the residuals,
fitted.values: the fitted values,
formula: the formula (a Formula object),
expanded.formula: the formula (a formula object),
model: the model frame used,
index: the index of the choice and of the alternatives.
Author(s)
Yves Croissant
References
McFadden D (1973). “Conditional Logit Analysis of Qualitative Choice Behaviour.” In Zarembka P (ed.), Frontiers in Econometrics, 105-142. Academic Press New York, New York, NY, USA.
McFadden D (1974). “The measurement of urban travel demand.” Journal of Public Economics, 3(4), 303-328. ISSN 0047-2727, https://www.sciencedirect.com/science/article/pii/0047272774900036.
Train K (2009). Discrete Choice Methods with Simulation. Cambridge University Press. https://EconPapers.repec.org/RePEc:cup:cbooks:9780521766555.
See Also
mlogit.data() to shape the data. nnet::multinom() from package nnet performs the estimation of the multinomial logit model with individual-specific variables. mlogit.optim() for details about the optimization function.
Examples
## Cameron and Trivedi's Microeconometrics, p. 493. There are two
## alternative-specific variables (price and catch), one
## individual-specific variable (income) and four fishing modes: beach,
## pier, boat, charter
data("Fishing", package = "mlogit")
Fish <- dfidx(Fishing, varying = 2:9, shape = "wide", choice = "mode")
## a pure "conditional" model
summary(mlogit(mode ~ price + catch, data = Fish))
## a pure "multinomial model"
summary(mlogit(mode ~ 0 | income, data = Fish))
## which can also be estimated using multinom (package nnet)
summary(nnet::multinom(mode ~ income, data = Fishing))
## a "mixed" model
m <- mlogit(mode ~ price + catch | income, data = Fish)
summary(m)
## same model with charter as the reference level
m <- mlogit(mode ~ price + catch | income, data = Fish, reflevel = "charter")
## same model with a subset of alternatives: charter, pier, beach
m <- mlogit(mode ~ price + catch | income, data = Fish,
alt.subset = c("charter", "pier", "beach"))
## model on unbalanced data i.e. for some observations, some
## alternatives are missing
# a data.frame in wide format with two missing prices
Fishing2 <- Fishing
Fishing2[1, "price.pier"] <- Fishing2[3, "price.beach"] <- NA
mlogit(mode ~ price + catch | income, Fishing2, shape = "wide", varying = 2:9)
# a data.frame in long format with three missing lines
data("TravelMode", package = "AER")
Tr2 <- TravelMode[-c(2, 7, 9),]
mlogit(choice ~ wait + gcost | income + size, Tr2)
## A heteroscedastic logit model
data("TravelMode", package = "AER")
hl <- mlogit(choice ~ wait + travel + vcost, TravelMode, heterosc = TRUE)
## A nested logit model
TravelMode$avincome <- with(TravelMode, income * (mode == "air"))
TravelMode$time <- with(TravelMode, travel + wait)/60
TravelMode$timeair <- with(TravelMode, time * I(mode == "air"))
TravelMode$income <- with(TravelMode, income / 10)
# Hensher and Greene (2002), table 1 p.8-9 model 5
TravelMode$incomeother <- with(TravelMode, ifelse(mode %in% c('air', 'car'), income, 0))
nl <- mlogit(choice ~ gcost + wait + incomeother, TravelMode,
nests = list(public = c('train', 'bus'), other = c('car','air')))
# same with a common nest elasticity (model 1)
nl2 <- update(nl, un.nest.el = TRUE)
## a probit model
## Not run:
pr <- mlogit(choice ~ wait + travel + vcost, TravelMode, probit = TRUE)
## End(Not run)
## a mixed logit model
## Not run:
rpl <- mlogit(mode ~ price + catch | income, Fishing, varying = 2:9,
rpar = c(price= 'n', catch = 'n'), correlation = TRUE,
halton = NA, R = 50)
summary(rpl)
rpar(rpl)
cor.mlogit(rpl)
cov.mlogit(rpl)
rpar(rpl, "catch")
summary(rpar(rpl, "catch"))
## End(Not run)
# a rank-ordered model
data("Game", package = "mlogit")
g <- mlogit(ch ~ own | hours, Game, varying = 1:12, ranked = TRUE,
reflevel = "PC", idnames = c("chid", "alt"))