BTm {BradleyTerry2} | R Documentation |
Bradley-Terry Model and Extensions
Description
Fits Bradley-Terry models for pair comparison data, including models with structured scores, order effect and missing covariate data. Fits by either maximum likelihood or maximum penalized likelihood (with Jeffreys-prior penalty) when abilities are modelled exactly, or by penalized quasi-likelihood when abilities are modelled by covariates.
Usage
BTm(
outcome = 1,
player1,
player2,
formula = NULL,
id = "..",
separate.ability = NULL,
refcat = NULL,
family = "binomial",
data = NULL,
weights = NULL,
subset = NULL,
na.action = NULL,
start = NULL,
etastart = NULL,
mustart = NULL,
offset = NULL,
br = FALSE,
model = TRUE,
x = FALSE,
contrasts = NULL,
...
)
Arguments
outcome |
the binomial response: either a numeric vector, a factor in which the first level denotes failure and all others success, or a two-column matrix with the columns giving the numbers of successes and failures. |
player1 |
either an ID factor specifying the first player in each contest, or a data.frame containing such a factor and possibly other contest-level variables that are specific to the first player. If given in a data.frame, the ID factor must have the name given in the id argument. |
player2 |
an object corresponding to that given in player1, for the second player in each contest. |
formula |
a formula with no left-hand-side, specifying the model for player ability. See details for more information. |
id |
the name of the ID factor. |
separate.ability |
(if |
refcat |
(if |
family |
a description of the error distribution and link function to
be used in the model. Only the binomial family is implemented, with
either |
data |
an optional object providing data required by the model. This
may be a single data frame of contest-level data or a list of data frames.
Names of data frames are ignored unless they refer to data frames specified
by |
weights |
an optional numeric vector of ‘prior weights’. |
subset |
an optional logical or numeric vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when any contest-level variables contain NA. |
start |
a vector of starting values for the fixed effects. |
etastart |
a vector of starting values for the linear predictor. |
mustart |
a vector of starting values for the vector of means. |
offset |
an optional offset term in the model. A vector of length equal to the number of contests. |
br |
logical. If |
model |
logical: whether or not to return the model frame. |
x |
logical: whether or not to return the design matrix for the fixed effects. |
contrasts |
an optional list specifying contrasts for the factors in
|
... |
other arguments for fitting function (currently either
|
Details
In each comparison to be modelled there is a 'first player' and a 'second player' and it is assumed that one player wins while the other loses (no allowance is made for tied comparisons).
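For orientation, the standard Bradley-Terry model under the default logit link can be written as below. The symbols are introduced here for illustration and are not taken from this help page: \lambda_i denotes the ability of player i, and the formula argument determines how \lambda_i is modelled.

```latex
\operatorname{logit}\,\Pr(i \text{ beats } j) = \lambda_i - \lambda_j
```

Only differences in ability are identified, which is why a reference player (see refcat) or some other constraint is needed.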
The countsToBinomial() function is provided to convert a contingency table of wins into a data frame of wins and losses for each pair of players.
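A minimal sketch of this conversion follows. The 3-player table is hypothetical, and the sketch assumes that entry [i, j] holds the number of times the row player beat the column player; the resulting data frame is then passed to BTm() in the same way as in the Examples section.

```r
library(BradleyTerry2)

## Hypothetical table of results (illustrative numbers only).
## Assumption: rows index winners, columns index losers.
wins <- matrix(c(0, 3, 1,
                 2, 0, 4,
                 5, 1, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(winner = c("A", "B", "C"),
                               loser  = c("A", "B", "C")))

## One row per pair of players, with wins for each side
wins.sf <- countsToBinomial(wins)
wins.sf

## Fit a separate ability for each player
fit <- BTm(cbind(win1, win2), player1, player2, data = wins.sf)
BTabilities(fit)
```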
The formula argument specifies the model for player ability and applies to both the first player and the second player in each contest. If NULL, a separate ability is estimated for each player, equivalent to setting formula = reformulate(id).
Contest-level variables can be specified in the formula in the usual manner; see formula(). Player covariates should be included as variables indexed by id (see examples). Player covariates must therefore be ordered according to the levels of the ID factor.
If formula includes player covariates and there are players with missing values over these covariates, then a separate ability will be estimated for those players.
When player abilities are modelled by covariates, random player effects should be added to the model. These should be specified in the formula using the vertical bar notation of lme4::lmer(); see examples.
When specified, random player effects are assumed to arise from a N(0, \sigma^2) distribution, and model parameters, including \sigma, are estimated using PQL (Breslow and Clayton, 1993) as implemented in the glmmPQL() function.
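The following sketch illustrates a player covariate indexed by the ID factor together with a random player effect. It assumes the flatlizards data set shipped with BradleyTerry2 (a list containing a data frame of contests and a data frame of player-level predictors) and uses a single covariate, SVL, purely for illustration; it is not a recommended model for these data.

```r
library(BradleyTerry2)

data(flatlizards)

## outcome = 1: each row of the contest data is a single contest
## won by the first player.
## SVL[..] indexes the player covariate SVL by the default ID factor "..";
## (1 | ..) adds a random effect for each player.
lizModel <- BTm(1, winner, loser,
                formula = ~ SVL[..] + (1 | ..),
                data = flatlizards)
summary(lizModel)
```

Because the formula contains a random effect, the fit is carried out by PQL rather than by glm().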
Value
An object of class c("BTm", "x"), where "x" is the class of object returned by the model fitting function (e.g. glm). Components are as for objects of class "x", with additionally
id |
the |
separate.ability |
the
|
refcat |
the |
player1 |
a data frame for the first player containing the ID factor and any player-specific contest-level variables. |
player2 |
a data frame corresponding to that for player1. |
assign |
a numeric vector indicating which coefficients correspond to which terms in the model. |
term.labels |
labels for the model terms. |
random |
for models with random effects, the design matrix for the random effects. |
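As a brief sketch, the additional components can be inspected on a fitted object. This reuses the citation model from the Examples section; the printed values are not reproduced here.

```r
library(BradleyTerry2)

citations.sf <- countsToBinomial(citations)
names(citations.sf)[1:2] <- c("journal1", "journal2")
citeModel <- BTm(cbind(win1, win2), journal1, journal2, data = citations.sf)

class(citeModel)         # "BTm" followed by the class of the underlying fit
citeModel$id             # name of the ID factor
citeModel$term.labels    # labels for the model terms
head(citeModel$player1)  # data frame for the first player
```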
Author(s)
Heather Turner, David Firth
References
Agresti, A. (2002) Categorical Data Analysis (2nd ed). New York: Wiley.
Firth, D. (1992) Bias reduction, the Jeffreys prior and GLIM. In Advances in GLIM and Statistical Modelling, Eds. Fahrmeir, L., Francis, B. J., Gilchrist, R. and Tutz, G., pp91–100. New York: Springer.
Firth, D. (1993) Bias reduction of maximum likelihood estimates. Biometrika 80, 27–38.
Firth, D. (2005) Bradley-Terry models in R. Journal of Statistical Software, 12(1), 1–12.
Stigler, S. (1994) Citation patterns in the journals of statistics and probability. Statistical Science 9, 94–108.
Turner, H. and Firth, D. (2012) Bradley-Terry models in R: The BradleyTerry2 package. Journal of Statistical Software, 48(9), 1–21.
See Also
countsToBinomial(), glmmPQL(), BTabilities(), residuals.BTm(), add1.BTm(), anova.BTm()
Examples
########################################################
## Statistics journal citation data from Stigler (1994)
## -- see also Agresti (2002, p448)
########################################################
## Convert frequencies to success/failure data
citations.sf <- countsToBinomial(citations)
names(citations.sf)[1:2] <- c("journal1", "journal2")
## First fit the "standard" Bradley-Terry model
citeModel <- BTm(cbind(win1, win2), journal1, journal2, data = citations.sf)
## Now the same thing with a different "reference" journal
citeModel2 <- update(citeModel, refcat = "JASA")
BTabilities(citeModel2)
##################################################################
## Now an example with an order effect -- see Agresti (2002) p438
##################################################################
data(baseball) # start with baseball data as provided by package
## Simple Bradley-Terry model, ignoring home advantage:
baseballModel1 <- BTm(cbind(home.wins, away.wins), home.team, away.team,
                      data = baseball, id = "team")
## Now incorporate the "home advantage" effect
baseball$home.team <- data.frame(team = baseball$home.team, at.home = 1)
baseball$away.team <- data.frame(team = baseball$away.team, at.home = 0)
baseballModel2 <- update(baseballModel1, formula = ~ team + at.home)
## Compare the fit of these two models:
anova(baseballModel1, baseballModel2)
##
## For a more elaborate example with both player-level and contest-level
## predictor variables, see help(chameleons).
##