welofit {welo} R Documentation

Calculates the WElo and Elo rates

Description

Calculates the WElo and Elo rates according to Angelini et al. (2022). In particular, the Elo updating system defines the rates (for player i) as:

E_{i}(t+1) = E_{i}(t) + K_i(t) \left[W_{i}(t)- \hat{p}_{i,j}(t) \right],

where E_{i}(t) is the Elo rate at time t, W_{i}(t) is the outcome (1 for a win, 0 for a loss) for player i in the match at time t, K_i(t) is a scale factor, and \hat{p}_{i,j}(t) is the probability that player i wins the match at time t against player j, calculated using tennis_prob. The scale factor K_i(t) determines how much the rates change over time. By default, according to Kovalchik (2016), it is defined as

K_i(t)=250/\left(N_i(t)+5\right)^{0.4},

where N_i(t) is the number of matches played by player i up to time t. Alternatively, K_i(t) can be multiplied by 1.1 if the match at time t is a Grand Slam match or is played on a given surface. Finally, it can be fixed to a constant value. The WElo rating system is defined as:

E_{i}^\ast(t+1) = E_{i}^\ast(t) + K_i(t) \left[W_{i}(t)- \hat{p}_{i,j}^\ast(t) \right] f(W_{i,j}(t)),

where E_{i}^\ast(t+1) denotes the WElo rate for player i, \hat{p}_{i,j}^\ast(t) the probability of winning using tennis_prob and the WElo rates, and f(W_{i,j}(t)) represents a function whose values depend on the games (by default) or sets won in the previous match. In particular, when parameter 'W' is set to "GAMES", f(W_{i,j}(t)) is defined as:

f(W_{i,j}(t)) \equiv f(G_{i,j}(t)) = \left\{ \begin{array}{ll} \frac{NG_i(t)}{NG_i(t)+NG_j(t)} & \text{if player } i \text{ has won match } t;\\ \frac{NG_j(t)}{NG_i(t)+NG_j(t)} & \text{if player } i \text{ has lost match } t, \end{array} \right.

where NG_i(t) and NG_j(t) represent the number of games won by player i and player j in match t, respectively. When parameter 'W' is set to "SETS", f(W_{i,j}(t)) is:

f(W_{i,j}(t)) \equiv f(S_{i,j}(t)) = \left\{ \begin{array}{ll} \frac{NS_i(t)}{NS_i(t)+NS_j(t)} & \text{if player } i \text{ has won match } t;\\ \frac{NS_j(t)}{NS_i(t)+NS_j(t)} & \text{if player } i \text{ has lost match } t, \end{array} \right.

where NS_i(t) and NS_j(t) represent the number of sets won by player i and player j in match t, respectively. The scale factor K_i(t) is the same as in the Elo model.
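As a rough illustration of the two updating rules above, the following sketch applies a single-match update for player i. The helper names (elo_prob, kovalchik_k, welo_weight, update_rate) are hypothetical and are not part of the welo package. The win probability is assumed to follow the standard logistic Elo formula with a 400-point scale, which may differ from what tennis_prob actually computes.

```r
# Illustrative sketch only: these helpers are hypothetical and are not
# the internal code of the welo package.

# Assumed standard logistic Elo win probability for player i against player j.
elo_prob <- function(rate_i, rate_j) {
  1 / (1 + 10^((rate_j - rate_i) / 400))
}

# Default scale factor: K_i(t) = 250 / (N_i(t) + 5)^0.4
kovalchik_k <- function(n_matches) {
  250 / (n_matches + 5)^0.4
}

# Weight f(W_{i,j}(t)): share of games (or sets) taken by the match winner.
# won_i and won_j are the games (or sets) won by players i and j in match t.
welo_weight <- function(won_i, won_j, i_won) {
  winner_units <- if (i_won) won_i else won_j
  winner_units / (won_i + won_j)
}

# One-match update for player i. With weight = 1 this is the plain Elo update;
# passing welo_weight(...) gives the WElo update.
update_rate <- function(rate_i, rate_j, i_won, n_matches_i, weight = 1) {
  outcome <- as.numeric(i_won)
  k <- kovalchik_k(n_matches_i)
  rate_i + k * (outcome - elo_prob(rate_i, rate_j)) * weight
}

# Example: both players start at 1500; player i has 20 previous matches
# and wins 6-4 6-3 (12 games to 7).
elo_new  <- update_rate(1500, 1500, TRUE, 20)
welo_new <- update_rate(1500, 1500, TRUE, 20,
                        weight = welo_weight(12, 7, TRUE))
```

Because the weight is the winner's share of games or sets, it never exceeds 1, so in this sketch the WElo rate moves less than the Elo rate after a closely fought match.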

Usage

welofit(
  x,
  W = "GAMES",
  SP = 1500,
  K = "Kovalchik",
  CI = FALSE,
  alpha = 0.05,
  B = 1000,
  new_data = NULL
)

Arguments

x

Data cleaned through the function clean or, if the parameter 'new_data' is present, a previously estimated list returned by the welofit function

W

optional Weights to use for the WElo rating system. Valid choices are: "GAMES" (by default) and "SETS"

SP

optional Starting points for calculating the rates. 1500 by default

K

optional Scale factor determining how much the WElo and Elo rates change over time. Valid choices are: "Kovalchik" (by default), "Grand_Slam", "Surface_Hard", "Surface_Grass", "Surface_Clay" and, finally, a constant value K. The first option ("Kovalchik") corresponds to the scale factor suggested by Kovalchik (2016). Setting K to "Grand_Slam" multiplies the Kovalchik scale factor by 1.1 if the match is a Grand Slam match. Similarly, the choices "Surface_Hard", "Surface_Grass" and "Surface_Clay" multiply the Kovalchik scale factor by 1.1 if the match is played on hard, grass or clay, respectively. Finally, K can be any scalar value, independent of the number of matches played before match t

CI

optional Confidence intervals for the WElo and Elo rates. Defaults to FALSE. If 'CI' is set to TRUE, the confidence intervals are calculated according to the procedure explained by Angelini et al. (2022)

alpha

optional Significance level of the confidence interval. Defaults to 0.05

B

optional Number of bootstrap samples used to calculate the confidence intervals. Defaults to 1000

new_data

optional New data, cleaned through the function clean, to append to an already estimated set of matches (included in the parameter 'x')
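The choices for 'K' described above can be sketched as a small helper. This is only an illustration of the documented rules, not the package's implementation: the function scale_factor, its arguments is_grand_slam and surface, and the surface labels "Hard", "Grass" and "Clay" are all assumptions made for the example.

```r
# Hypothetical helper illustrating the documented K options;
# it is not part of the welo package.
scale_factor <- function(K = "Kovalchik", n_matches, is_grand_slam = FALSE,
                         surface = NA_character_) {
  # A numeric K is used as a constant, whatever the number of matches played.
  if (is.numeric(K)) return(K)

  # Kovalchik (2016) scale factor: 250 / (N + 5)^0.4
  base <- 250 / (n_matches + 5)^0.4

  # Decide whether the 1.1 multiplier applies for the chosen option.
  # The surface labels below are assumed for this sketch.
  boosted <- switch(K,
    Kovalchik     = FALSE,
    Grand_Slam    = is_grand_slam,
    Surface_Hard  = identical(surface, "Hard"),
    Surface_Grass = identical(surface, "Grass"),
    Surface_Clay  = identical(surface, "Clay"),
    stop("unknown K option: ", K)
  )
  if (boosted) base * 1.1 else base
}

# Player with 27 previous matches, Grand Slam match on clay
k_default <- scale_factor("Kovalchik", 27)
k_slam    <- scale_factor("Grand_Slam", 27, is_grand_slam = TRUE)
k_clay    <- scale_factor("Surface_Clay", 27, surface = "Clay")
k_const   <- scale_factor(32, 27)
```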

Value

welofit returns an object of class 'welo', which is a list.

References

Angelini G, Candila V, De Angelis L (2022). “Weighted Elo rating for tennis match predictions.” European Journal of Operational Research, 297(1), 120–132.

Brier GW (1950). “Verification of forecasts expressed in terms of probability.” Monthly Weather Review, 78(1), 1–3.

Kovalchik SA (2016). “Searching for the GOAT of tennis win prediction.” Journal of Quantitative Analysis in Sports, 12(3), 127–138.

Examples


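library(welo)
# load the ATP 2019 matches and clean them
data(atp_2019)
db_clean <- clean(atp_2019)
# estimate the WElo and Elo rates on the full data
res <- welofit(db_clean)
# append new data: estimate on the first 500 matches,
# then add matches 501 to 1200 to the estimated object
db_clean_1 <- db_clean[1:500, ]
db_clean_2 <- db_clean[501:1200, ]
res_1 <- welofit(db_clean_1)
res_2 <- welofit(res_1, new_data = db_clean_2)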


[Package welo version 0.1.4 Index]