mvglmmRank {mvglmmRank}    R Documentation

mvglmmRank

Description

This function fits a (multivariate) generalized linear mixed model to team scores and/or win/loss indicators.

Usage

mvglmmRank(game.data, method = "PB0", first.order = FALSE, 
home.field = TRUE, max.iter.EM = 1000, tol1 = 1e-04, 
tol2 = 1e-04, tolFE = 0, tol.n = 1e-07, verbose = TRUE, OT.flag = FALSE, 
Hessian = FALSE, REML.N=TRUE)

Arguments

game.data

a data frame that contains a column "home" of team names, a column "away" of team names, a column "home.response" containing the scores (or other response) of the "home" teams, a column "away.response" containing the scores (or other response) of the "away" teams, (optionally) a column "binary.response" of binary responses (0's and 1's), and (optionally) a column "neutral.site" which takes the value 1 for neutral site games and 0 otherwise. NOTE: If game.data does not contain a "binary.response" column, then an indicator will be created for whether the home team won. NOTE: For neutral site games, randomly assign the teams as "home" or "away". As noted under OT.flag below, the data frame may optionally contain a column, OT, which indicates how many overtime periods were played. NOTE: the game.data$OT column should not contain missing data. If there was no overtime, specify "none" or 0.

method

a character string (remember to use quotation marks!). Choices are "N", "P0", "P1", "B", "NB", "PB0", "PB1", "NB.mov", or "N.mov". "N" indicates the scores are fit with a normal distribution with intra-game correlation between the home and away teams accounted for in an unstructured 2x2 error covariance matrix. "P" indicates the scores are fit with a Poisson distribution. "B" indicates the home win/loss indicators are fit using a binary distribution with a probit link. The presence of a "1" with a "P" indicates that potential intra-game correlation is modeled with an additional game-level random effect. A "0" indicates no such random effects are included. "NB.mov" fits the margin of victory of the "home" team (under an assumed normal distribution) jointly with the binary win/loss indicators. "N.mov" fits only the margin of victory of the "home" team (under an assumed normal distribution). See the NOTES section below for further details.

first.order

logical. TRUE requests that only a first order Laplace approximation be used, FALSE requests a fully exponential Laplace approximation. See the references.

home.field

logical. TRUE requests that separate home and away mean scores be modeled (and a mean neutral site score, if applicable), together with a single home field effect in the binary model. FALSE requests that only a single mean be calculated for the scores, and that no fixed effects be fit for the binary win/loss indicators. Note that the estimator for the home field effect may be biased, depending on the scheduling structure; see the Karl and Zimmerman (2021) reference.

max.iter.EM

a number giving the maximum number of EM iterations.

tol1

the maximum relative change in parameters between iterations. This is the convergence criterion for the first order Laplace approximation. The first order Laplace approximation runs until the tol1 criterion is met, at which point the fully exponential corrections for the random effects vector begin.

tol2

The fully exponential iterations run until the maximum relative change in model parameters is less than tol2. Not applicable when first.order=TRUE.

tolFE

intermediate convergence criterion for fully exponential approximations. The algorithm runs with the fully exponential corrections only to the random effects vector until the tolFE criterion is met (maximum relative change in parameters). After this, the fully exponential corrections for both the random effects vector and the random effects covariance matrix are calculated.

tol.n

convergence tolerance for EM algorithm with method="N". Convergence is declared when (l_k-l_{k-1})/l_k < tol.n, where l_k is the log-likelihood at iteration k.

verbose

logical. If TRUE, model information will be printed after each iteration.

OT.flag

logical. If TRUE, then there should be a continuous column OT in game.data that indicates how many overtime periods there were for each game. The information will not be used for the binary models. NOTE: the game.data$OT column should not contain missing data. If there was no overtime, specify 0.

Hessian

logical. If TRUE, the Hessian of the model parameters is calculated via a central difference approximation.

REML.N

logical. If TRUE and if method=="N.mov" or method=="N", then REML estimation is used instead of ML.
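As a minimal sketch of the layout described for the game.data argument above, the following base R code builds a small data frame with the documented column names. The team names and scores are invented for illustration, and whether the function expects character or factor team columns is not stated here, so the use of character columns is an assumption.

```r
# Hypothetical three-game schedule; team names and scores are invented.
game.data <- data.frame(
  home          = c("Team A", "Team B", "Team C"),
  away          = c("Team B", "Team C", "Team A"),
  home.response = c(24, 17, 31),
  away.response = c(20, 21, 28),
  neutral.site  = c(0, 0, 1),  # 1 = neutral site game, 0 otherwise
  OT            = c(0, 0, 1),  # number of overtime periods; no missing values
  stringsAsFactors = FALSE     # assumption: character team names are acceptable
)

# Optional column: 1 if the home team outscored the away team, 0 otherwise.
# The documentation says an equivalent indicator is created when this column
# is absent.
game.data$binary.response <-
  as.integer(game.data$home.response > game.data$away.response)

str(game.data)
```

The third game is flagged as a neutral site game, so its "home" and "away" labels are arbitrary, as the NOTE in the argument description suggests.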

Details

Setting first.order=TRUE will yield the first order Laplace approximation. A partial fully exponential Laplace approximation can be obtained by setting tol1 > tol2 and tolFE=0. This will apply fully exponential corrections to the vector of team ratings (the EBLUPs), but not to the covariance matrix of this vector. Karl, Yang, and Lohr (2014) show that this approach produces a large portion of the benefit of the fully exponential Laplace approximation in only a fraction of the time. Using the default tolerances of mvglmmRank leads to this behavior.

To summarize, the models (except for method="N") run with the first order Laplace approximation until the relative change between parameters is <= tol1. If first.order=TRUE, the program stops. Otherwise, the program continues with the Laplace approximation, applying fully exponential corrections to the random effects vector until the maximum of the relative parameter changes is <= tolFE. At this point, the program continues using the complete fully exponential Laplace approximation (corrections to both the random effects vector and its covariance matrix) until the maximum relative parameter change is <= tol2. If tolFE < tol2, then the program will finish without applying fully exponential corrections to the random effects covariance matrix.
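The staged procedure summarized above can be restated as a small base R helper. This is only an illustrative sketch: describe_stages is a hypothetical function, not part of mvglmmRank, and it does not cover method="N". It also assumes that when tolFE < tol2 the run ends once the tol2 criterion is met.

```r
# Hypothetical helper (not part of mvglmmRank): lists the stages implied by
# the tolerance settings, following the description in the text.
describe_stages <- function(first.order = FALSE, tol1 = 1e-04,
                            tol2 = 1e-04, tolFE = 0) {
  stages <- sprintf("first order Laplace until change <= %g", tol1)
  if (first.order) {
    return(stages)
  }
  if (tolFE < tol2) {
    # Assumption: tol2 is reached before tolFE, so the corrections to the
    # random effects covariance matrix never start.
    stages <- c(stages,
      sprintf("corrections to random effects vector until change <= %g", tol2))
  } else {
    stages <- c(stages,
      sprintf("corrections to random effects vector until change <= %g", tolFE),
      sprintf("corrections to vector and covariance matrix until change <= %g", tol2))
  }
  stages
}

describe_stages()                             # default tolerances
describe_stages(first.order = TRUE)           # first order only
describe_stages(tolFE = 1e-03, tol2 = 1e-05)  # all three stages
```

With the default tolerances the helper lists two stages, which corresponds to the partial fully exponential approximation described in the first paragraph of this section.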

method="PB1" is the least scalable, as the memory and computational requirements for this model are at least O((teams+number of games)^2). In the example data included with the package, the NCAA basketball data is slow with the fully exponential approximation and method="PB1".

Value

mvglmmRank returns an object of class mvglmmRank.

An object of class mvglmmRank is a list containing the following components:

n.ratings.offense

The vector of offensive ratings from the normal model, or NULL if the normal model was not fit.

n.ratings.defense

The vector of defensive ratings from the normal model, or NULL if the normal model was not fit.

p.ratings.offense

The vector of offensive ratings from the Poisson model, or NULL if the Poisson model was not fit.

p.ratings.defense

The vector of defensive ratings from the Poisson model, or NULL if the Poisson model was not fit.

b.offense

The vector of win-propensity ratings from the binary model, or NULL if the binary model was not fit.

n.mean

Mean scores from the normal model.

p.mean

Mean scores from the Poisson model.

b.mean

Home field effect from the binary model.

G

Single block of random effects covariance matrix.

G.cor

Correlation matrix corresponding to covariance matrix G.

R

Error covariance matrix for normal model, or NULL if normal model not used.

R.cor

Error correlation matrix for normal model, or NULL if normal model not used.

home.field

Logical indicating whether or not a home field effect was modeled.

Hessian

The Hessian of the model parameters, if requested.

parameters

A vector of fitted model parameters.

N.output

NULL, or a list if method="N" or method="N.mov". In the latter cases, the list contains the random effect design matrix Z, the fixed effects design matrix X, the estimated random effects covariance matrix G, the estimated error covariance matrix R, the predicted random effects eta, the joint covariance matrix of fixed and random effects ybetas_eblup_asycov, the covariance matrix of the fixed effects only ybetas_asycov, and the standard errors of the fixed effects ybetas_stderror.

fixed.effect.model.output

NULL, or a list if method="N.mov". In the latter case, the list contains information about the results of fitting the margin of victory model with fixed (instead of random) team effects: the fixed effect design matrix X, the fixed effect parameter estimates beta, a logical is.mean.estimable indicating whether or not the home field effect is estimable (see Notes), the predicted margins of victory pred, the residuals resid, the fitted model variance sigma.sq, and the covariance matrix of the random effects beta.covariance. This can provide an unbiased estimate when the estimator from the mixed model is biased (Karl and Zimmerman, 2021).

The function game.pred may be used to predict the outcome of future games.
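A brief sketch of inspecting the returned list follows. It assumes the package is installed and uses the nfl2012 data from the Examples section. The component names are those listed above, and the object name fit is arbitrary.

```r
# Runs only if mvglmmRank is installed; otherwise the block is skipped.
if (requireNamespace("mvglmmRank", quietly = TRUE)) {
  library(mvglmmRank)
  data(nfl2012)
  fit <- mvglmmRank(nfl2012, method = "PB0", first.order = TRUE,
                    verbose = FALSE)

  # Poisson offensive ratings, largest first
  print(head(sort(fit$p.ratings.offense, decreasing = TRUE)))
  # Home field effect from the binary model
  print(fit$b.mean)
  # Correlation matrix for a single block of random effects
  print(fit$G.cor)
  # The normal model is not fit with method = "PB0", so this component is
  # documented to be NULL
  print(is.null(fit$n.ratings.offense))
}
```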

Author(s)

Andrew T. Karl akarl@asu.edu, Jennifer Broatch

References

Broatch, J.E. and Karl, A.T. (2018). Multivariate Generalized Linear Mixed Models for Joint Estimation of Sporting Outcomes. Italian Journal of Applied Statistics. Vol.30, No.2, 189-211. Also available from https://arxiv.org/abs/1710.05284.

Karl, A.T. and Zimmerman, D.L. (2021). A Diagnostic for Bias in Linear Mixed Model Estimators Induced by Dependence Between the Random Effects and the Corresponding Model Matrix. Journal of Statistical Planning and Inference, 211, 107-118. https://doi.org/10.1016/j.jspi.2020.06.004.

Karl, A.T., Yang, Y. and Lohr, S. (2013). Efficient Maximum Likelihood Estimation of Multiple Membership Linear Mixed Models, with an Application to Educational Value-Added Assessments. Computational Statistics and Data Analysis, 59, 13-27.

Karl, A., Yang, Y. and Lohr, S. (2014). Computation of Maximum Likelihood Estimates for Multiresponse Generalized Linear Mixed Models with Non-nested, Correlated Random Effects. Computational Statistics and Data Analysis, 73, 146-162.

Karl, A.T. (2012). The Sensitivity of College Football Rankings to Several Modeling Choices. Journal of Quantitative Analysis in Sports, Volume 8, Issue 3. DOI 10.1515/1559-0410.1471.

See Also

See also game.pred

Examples

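library(mvglmmRank)
data(nfl2012)
# Quick check: stop after a single EM iteration
mvglmmRank(nfl2012,method="PB0",first.order=TRUE,verbose=TRUE,max.iter.EM=1)

# Full fit using the first order Laplace approximation
result <- mvglmmRank(nfl2012,method="PB0",first.order=TRUE,verbose=TRUE)
print(result)
# Predict the outcome of a game between two teams
game.pred(result,home="Denver Broncos",away="Green Bay Packers")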


[Package mvglmmRank version 1.2-4 Index]