BTdecayLasso {BTdecayLasso}R Documentation

Bradley-Terry Model with Exponential Decayed weighted likelihood and Adaptive Lasso

Description

Bradley-Terry model is applied for paired comparison data. Teams' ability score is estimated by maximizing log-likelihood function.

To achieve a better track of current abilities, we apply an exponential decay rate to weight the log-likelihood function. The most current matches will weight more than previous matches. Parameter "decay.rate" in most functions of this package is used to set the amount of exponential decay rate. decay.rate should be non-negative and the appropriate range of it depends on time scale in original dataframe. (see BTdataframe and parameter "dataframe"'s definition of fifth column) For example, a unit of week with a "decay.rate" 0.007 is equivalent to the unit of day with "decay.rate" 0.001. Usually, for sports matches, if we take the unit of day, it's ranging from 0 to 0.01. The higher choice of "decay.rate", the better track of current teams' ability with a side effect of higher variance.

If "decay.rate" is too large, for example "0.1" with a unit of day, \exp(-0.7) = 0.50. Only half weight will be add to the likelihood for matches played one week ago and \exp(-3.1) = 0.05 suggests that previous matches took place one month ago will have little effect. Therefore, Only a few matches are accounted for ability's estimation. It will lead to a very high variance and uncertainty. Since standard Bradley-Terry model can not handle the case where there is a team who wins or loses all matches, such estimation may not provide convergent results. Thus, if our estimation provides divergent result, an error will be returned and we suggest user to chose a smaller "decay.rate" or adding more match results into the same modeling period.

By default, the Adaptive Lasso is implemented for variance reduction and team's grouping. Adaptive Lasso is proved to have good grouping property. Apart from adaptive lasso, user can define own weight for different Lasso constraint ≤ft|μ_{i}-μ_{j}\right| where μ_{i} is team i's ability.

Also by default, the whole Lasso path will be run. Similar to package "glmnet", user can provide their own choice of Lasso penalty "lambda" and determine whether the whole Lasso path will be run (since such run is time-consuming). However, we suggest that if user is not familiar with the actual relationship among lambda, the amount of penalty, the amount of shrinkage and grouping effect, a whole Lasso path should be run and selection of an appropriate lambda is done by AIC or BIC criteria using BTdecayLassoC (since this model is time related, cross-validation method cannot be applied). Also, users can use BTdecayLassoF to run with a specific Lasso penalty ranging from 0 to 1 (1 penalty means all estimators will shrink to 0).

Two sets of estimated abilities will be given, the biased Lasso estimation and the HYBRID Lasso's estimation. HYBRID Lasso estimation solves the restricted Maximum Likelihood optimization based on the group determined by Lasso's estimation (Different team's ability will converges to the same value if Lasso penalty is added and these teams' ability is setting to be equal as a restriction).

In addition, summary() using S3 method can be applied to view the outputs.

Usage

BTdecayLasso(dataframe, ability, lambda = NULL, weight = NULL,
  path = TRUE, decay.rate = 0, fixed = 1, thersh = 1e-05, max = 100,
  iter = 100)

Arguments

dataframe

Generated using BTdataframe given raw data.

ability

A column vector of teams ability, the last row is the home parameter. The row number is consistent with the team's index shown in dataframe. It can be generated using BTdataframe given raw data.

lambda

The amount of Lasso penalty induced. The input should be a positive scalar or a sequence.

weight

Weight for Lasso penalty on different abilities.

path

whether the whole Lasso path will be run (plot.BTdecayLasso is enabled only if path = TRUE)

decay.rate

A non-negative exponential decay rate. Usually ranging from (0, 0.01), A larger decay rate weights more importance to most recent matches and the estimated parameters reflect more on recent behaviour.

fixed

A teams index whose ability will be fixed as 0. The worstTeam's index can be generated using BTdataframe given raw data.

thersh

Threshold for convergence used for Augmented Lagrangian Method.

max

Maximum weight for w_ij (weight used for Adaptive Lasso)

iter

Number of iterations used in L-BFGS-B algorithm.

Details

According to BTdecay, the objective likelihood function to be optimized is,

∑_{k=1}^{n}∑_{i<j}\exp(-α t_{k})\cdot(y_{ij}(τ h_{ij}^{t_{k}}+μ_{i}-μ_{j})-\log(1+\exp(τ h_{ij}^{t_{k}}+μ_{i}-μ_{j})))

The Lasso constraint is given as,

∑_{i<j}w_{ij}≤ft|μ_{i}-μ_{j}\right|≤q s

where w_{ij} are predefined weight. For Adaptive Lasso, ≤ft|w_{ij}=1/(μ_{i}^{MLE}-μ_{j}^{MLE})\right|.

Maximize this constraint objective function is equivalent to minimizing the following equation,

-l(μ,τ)+λ∑_{i<j}w_{ij}|μ_{i}-μ_{j}|

Where -l(μ,τ) is taking negative value of objective function above. Increase "lambda" will decrease "s", their relationship is monotone. Here, we define "penalty" as 1-s/\max(s). Thus, "lambda" and "penalty" has a positive correlation.

Value

ability

Estimated ability scores with user given lambda

likelihood

Negative likelihood of objective function with user given lambda

df

Degree of freedom with user given lambda(number of distinct μ)

penalty

s/max(s) with user given lambda

Lambda

User given lambda

ability.path

if path = TRUE, estimated ability scores on whole Lasso path

likelihood.path

if path = TRUE, negative likelihood of objective function on whole Lasso path

df.path

if path = TRUE, degree of freedom on whole Lasso path(number of distinct μ)

penalty.path

if path = TRUE, s/max(s) on whole Lasso path

Lambda.path

if path = TRUE, Whole Lasso path

path

Whether whole Lasso path will be run

HYBRID.ability.path

If path = TRUE, the whole path of evolving of HYBRID ability

HYBRID.likelihood.path

if path = TRUE, the whole path of HYBRID likelihood

References

Masarotto, G. and Varin, C.(2012) The Ranking Lasso and its Application to Sport Tournaments. *The Annals of Applied Statistics* **6** 1949–1970.

Zou, H. (2006) The adaptive lasso and its oracle properties. *J.Amer.Statist.Assoc* **101** 1418–1429.

See Also

BTdataframe for dataframe initialization, plot.swlasso, plot.wlasso are used for Lasso path plot if path = TRUE in this function's run

Examples

##Initializing Dataframe
x <- BTdataframe(NFL2010)

##The following code runs the main results
##Usually a single lambda's run will take 1-20 s
##The whole Adaptive Lasso run will take 5-20 min

##BTdecayLasso run with exponential decay rate 0.005 and 
##lambda 0.1 on whole lasso path using adaptive lasso
y1 <- BTdecayLasso(x$dataframe, x$ability, lambda = 0.1, 
                   decay.rate = 0.005, fixed = x$worstTeam)
summary(y1)

##Defining equal weight
##Note that comparing to Adaptive weight, the user defined weight may not be 
##efficient in groupiing. Therefore, to run the whole Lasso path 
##(evolving of distinct ability scores), it may take a much longer time. 
##We recommend the user to apply the default setting,
##where Adaptive Lasso will be run.
n <- nrow(x$ability) - 1
w2 <- matrix(1, nrow = n, ncol = n)
w2[lower.tri(w2, diag = TRUE)] <- 0

##BTdecayLasso run with exponential decay rate 0.005 and with a specific lambda 0.1
y2 <- BTdecayLasso(x$dataframe, x$ability, lambda = 0.1, weight = w2, 
                   path = FALSE, decay.rate = 0.005, fixed = x$worstTeam)

##BTdecayLasso run with exponential decay rate 0.005 and with a specific lambda 0.1
##Time-consuming
y3 <- BTdecayLasso(x$dataframe, x$ability, lambda = 0.1, weight = w2, 
                   path = TRUE, decay.rate = 0.005, fixed = x$worstTeam)
summary(y2)

##Plot the Lasso path (S3 method)
plot(y1)
plot(y3)



[Package BTdecayLasso version 0.1.0 Index]