RankData-class {rankdist}    R Documentation

RankData Class

Description

An S4 class to represent ranking data

The ranking representation and the ordering representation of ranking data are easily confused. I thus use an S4 class to store all the information about the ranking data in one place, which avoids this confusion.
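As a minimal base-R sketch of the two representations (it does not use the RankData class, and the variable names are illustrative only): in the ranking representation, element i holds the rank given to object i, whereas in the ordering representation, element k holds the object placed at rank k. For a complete ranking without ties, each is the inverse permutation of the other.

```r
# Ranking representation: ranking[i] is the rank given to object i.
ranking <- c(3, 1, 4, 2)   # object 1 is ranked 3rd, object 2 is ranked 1st, ...

# Ordering representation: ordering[k] is the object placed at rank k.
ordering <- order(ranking)

# Applying order() again recovers the ranking, because the two
# representations are inverse permutations of each other.
ranking_back <- order(ordering)
```

Both vectors are permutations of 1:4, so nothing in the data itself reveals which representation is in use.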

Details

It is possible to store both complete and top-q rankings in the same RankData object. Three slots, topq, subobs and q_ind, are introduced for this purpose. There is generally no need to specify these slots if your data set contains only a single "q" level (for example, all data are top-10 rankings). The "q" level for a complete ranking should be nobj-1. Moreover, if the rankings are organized in chunks of increasing "q" levels (for example, top-2 rankings followed by top-3 rankings followed by top-5 rankings, etc.), then slots subobs and q_ind can also be inferred correctly by the initializer. It is therefore highly recommended that you organise the ranking matrix in this way and use the initializer.
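The recommended layout can be sketched in base R as follows. This is an illustration under stated assumptions, not code from the package: truncate_topq is a hypothetical helper, and the sketch relies on the convention that unobserved objects in a top-q ranking are coded as q+1.

```r
set.seed(1)  # only to make this illustration reproducible
nobj <- 5

# Five complete rankings, one per row.
full <- t(replicate(5, sample(nobj)))

# Hypothetical helper: keep the top q ranks and code the rest as q + 1.
truncate_topq <- function(rankmat, q) {
  rankmat[rankmat > q] <- q + 1
  rankmat
}

# Chunks in increasing order of q: top-2, then top-3, then complete.
rankmat <- rbind(
  truncate_topq(full[1:2, , drop = FALSE], 2),
  truncate_topq(full[3:4, , drop = FALSE], 3),
  full[5, , drop = FALSE]
)

# The q level of each row is its largest entry minus one,
# so a complete ranking of nobj objects has q = nobj - 1.
q_level <- apply(rankmat, 1, max) - 1
```

Here q_level is non-decreasing down the rows, which is the chunked organisation described above.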

Slots

nobj

The number of ranked objects. If not provided, it is inferred as the largest rank in the data set. It must therefore be provided for top-q data, where the largest recorded rank can be q+1 rather than the number of objects.

nobs

the number of observations. It need not be provided during initialization, since it must equal the sum of slot count.

ndistinct

the number of distinct rankings. It need not be provided during initialization, since it must equal the number of rows of slot ranking.

ranking

a matrix that stores the ranking representation of distinct rankings. Each row contains one ranking. For top-q ranking, all unobserved objects have ranking q+1.

count

the number of observations for each distinct ranking corresponding to each row of ranking.

topq

a numeric vector to store top-q ranking information. See the Details section for more information.

subobs

a numeric vector to store the number of observations for each chunk of top-q rankings.

q_ind

a numeric vector to store the beginning and ending of each chunk of top-q rankings. The last element has to be ndistinct+1.
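How the derived slots relate to ranking and count can be sketched in base R. This is not the package's initializer. It assumes that q_ind holds the first row index of each chunk followed by ndistinct+1, which is one reading of the description above. The local variables simply mirror the slot names.

```r
# Distinct rankings of 4 objects, one per row:
# two top-2 rankings followed by two complete rankings.
ranking <- rbind(
  c(1, 2, 3, 3),
  c(3, 1, 2, 3),
  c(1, 2, 3, 4),
  c(4, 3, 2, 1)
)
count <- c(5, 2, 3, 1)   # observations of each distinct ranking

ndistinct <- nrow(ranking)   # number of distinct rankings
nobs <- sum(count)           # total number of observations

# q level of each row: unobserved objects are coded as q + 1.
q_row <- apply(ranking, 1, max) - 1
topq <- unique(q_row)        # q levels in order of appearance

# Assumed layout: first row of each chunk, then ndistinct + 1 as end marker.
q_ind <- c(match(topq, q_row), ndistinct + 1)

# Observations per chunk: counts summed over rows sharing a q level.
subobs <- as.numeric(tapply(count, factor(q_row, levels = topq), sum))
```

With this layout, chunk i occupies rows q_ind[i] to q_ind[i+1]-1, and the elements of subobs sum to nobs.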

References

Qian Z, Yu PLH (2019). "Weighted Distance-Based Models for Ranking Data Using the R Package rankdist." Journal of Statistical Software, 90(5), 1-31. doi: 10.18637/jss.v090.i05

See Also

RankInit, RankControl

Examples

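library(rankdist)
# creating a data set with only complete rankings
# (each row is one ranking of 52 objects, as slot ranking requires)
rankmat <- t(replicate(10, sample(1:52, 52), simplify = "array"))
# one count per distinct ranking (row)
countvec <- sample(1:52, 10, replace = TRUE)
rankdat <- new("RankData", ranking = rankmat, count = countvec)
# creating a data set with both complete and top-10 rankings
rankmat_in <- t(replicate(10, sample(1:52, 52), simplify = "array"))
# unobserved objects in a top-10 ranking are coded as q + 1 = 11
rankmat_in[rankmat_in > 11] <- 11
# stack the top-10 chunk above the complete chunk (increasing "q" levels)
rankmat_total <- rbind(rankmat_in, rankmat)
countvec_total <- c(countvec, countvec)
rankdat2 <- new("RankData", ranking = rankmat_total, count = countvec_total, nobj = 52, topq = c(10, 51))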

[Package rankdist version 1.1.4 Index]