smlik {ohenery} | R Documentation |
Softmax log likelihood under Harville and Henery Models.
Description
Compute the softmax log likelihood and gradient of the same.
Usage
harsmlik(g, idx, eta, wt = NULL, deleta = NULL)
hensmlik(g, idx, eta, gamma, wt = NULL, deleta = NULL)
Arguments
g |
a vector giving the group indices. |
idx |
a vector of integers, the same length as |
eta |
a vector of the odds.
Must be the same length as |
wt |
an optional vector of non-negative elements, the same length
as |
deleta |
an optional matrix whose row count equals the number of elements
of |
gamma |
a vector of the gamma parameters. It is assumed that the
first element is |
Details
Given vectors g
, \eta
and optionally the gradient of \eta
with respect to some coefficients, computes the log likelihood under the
softmax. The user must provide a reverse ordering as well, which is sorted
first by the groups, g
, and then, within a group, in increasing
quality order. For a race, this means that the index is in order from the
last place to the first place in that race, where the group numbers
correspond to one race.
Under the Harville model, the log likelihood on a given group, where we are indexing in forward order, is
\left(\eta_1 - \log \sum_{j \ge 1} \mu_j\right) + \left(\eta_2 - \log \sum_{j \ge 2} \mu_j\right) + ...
where \mu_i = \exp{\eta_i}
.
By “forward order”, we mean that \eta_1
corresponds to the
participant taking first place within that group, \eta_2
took second
place, and so on.
The Henery model log likelihood takes the form
\left(\eta_1 - \log \sum_{j \ge 1} \mu_j\right) + \left(\gamma_2 \eta_2 - \log \sum_{j \ge 2} \mu_j^{\gamma_2}\right) + ...
for gamma parameters, \gamma
.
The Henery model corresponds to the Harville model where all the gammas equal 1.
Weights in weighted estimation apply to each summand. The weight for the last place participant in a group is irrelevant. The weighted log likelihood under the Harville model is
w_1\left(\eta_1 - \log \sum_{j \ge 1} \mu_j\right) + w_2\left(\eta_2 - \log \sum_{j \ge 2} \mu_j\right) + ...
One should think of the weights as applying to the outcome, not the participant.
Value
The log likelihood. If deleta
is given, we add an attribute
to the scalar number, called gradient
giving the derivative.
For the Henery model we also include a term of gradgamma
which is
the gradient of the log likelihood with respect to the gamma vector.
Note
The likelihood function does not yet support ties.
To avoid incorrect inference when only the top performers are recorded, and all others are effectively tied, one should use weighting. Set the weights to zero for participants who are tied non-winners, and one for the rest So for example, if you observe the Gold, Silver, and Bronze medal winners of an Olympic event that had a starting field of 12 participants, set weights to 1 for the medal winners, and 0 for the others. Note that the weights do not attach to the participants, they attach to the place they took.
Author(s)
Steven E. Pav shabbychef@gmail.com
References
Harville, D. A. "Assigning probabilities to the outcomes of multi-entry competitions." Journal of the American Statistical Association 68, no. 342 (1973): 312-316. http://dx.doi.org/10.1080/01621459.1973.10482425
Henery, R. J. "Permutation probabilities as models for horse races." Journal of the Royal Statistical Society: Series B (Methodological) 43, no. 1 (1981): 86-91. http://dx.doi.org/10.1111/j.2517-6161.1981.tb01153.x
Examples
# a garbage example
set.seed(12345)
g <- as.integer(sort(ceiling(20 * runif(100))))
idx <- as.integer(rev(1:length(g)) - 1L)
eta <- rnorm(length(g))
foo <- harsmlik(g=g,idx=idx,eta=eta,deleta=NULL)
# an example with a real idx
nfeat <- 5
set.seed(1234)
g <- ceiling(seq(0.1,1000,by=0.1))
X <- matrix(rnorm(length(g) * nfeat),ncol=nfeat)
beta <- rnorm(nfeat)
eta <- X %*% beta
y <- rsm(eta,g)
idx <- order(g,y,decreasing=TRUE) - 1
foores <- harsmlik(g,idx,eta,deleta=X)
# now reweight them
wt <- runif(length(g))
wt <- wt / mean(wt) # mean 1 is recommended
foores <- harsmlik(g,idx,eta,wt=wt)
# try hensmlik
foores <- hensmlik(g,idx,eta,gamma=c(0.9,0.8,1),wt=wt)
# check the value of the gradient by numerical approximation
nfeat <- 8
set.seed(321)
g <- ceiling(seq(0.1,1000,by=0.1))
X <- matrix(rnorm(length(g) * nfeat),ncol=nfeat)
beta <- rnorm(nfeat)
eta <- X %*% beta
y <- rsm(eta,g)
idx <- order(g,y,decreasing=TRUE) - 1
if (require(numDeriv)) {
fastval <- attr(harsmlik(g,idx,eta,deleta=X),'gradient')
numap <- grad(function(beta,g,idx,X) {
eta <- X %*% beta
as.numeric(harsmlik(g,idx,eta))
},
x=beta,g=g,idx=idx,X=X)
rbind(fastval,numap)
}