RankData-class {rankdist} | R Documentation
RankData Class
Description
An S4 class to represent ranking data.
It is well known that the ranking representation and the ordering representation of ranking data are easily confused. An S4 class is therefore used to store all the information about the ranking data, which avoids unnecessary confusion.
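To make the distinction concrete, here is a minimal base-R sketch for a single complete ranking of three objects. It does not use rankdist, and the variable names are illustrative only. In the ordering representation, position i holds the object placed i-th; in the ranking representation, position i holds the rank given to object i. For a complete ranking the two are inverse permutations, so order() converts one into the other.

```r
# Ordering representation: object 3 is placed first, object 1 second, object 2 third.
ordering <- c(3, 1, 2)

# Ranking representation: element i is the rank given to object i.
# For a permutation, order() returns the inverse permutation.
ranking <- order(ordering)
print(ranking)  # 2 3 1: object 1 has rank 2, object 2 rank 3, object 3 rank 1

# Applying order() again recovers the ordering representation.
ordering_back <- order(ranking)
```

The inverse-permutation trick only applies to complete rankings; top-q rankings contain ties at q+1 and need separate handling.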
Details
It is possible to store both complete and top-q rankings in the same RankData object. Three slots, topq, subobs, and q_ind, are introduced for this purpose. There is generally no need to specify these slots if the data set contains only a single "q" level (for example, all data are top-10 rankings). The "q" level of a complete ranking should be nobj-1.
Moreover, if the rankings are organized in chunks of increasing "q" levels (for example, top-2 rankings followed by top-3 rankings followed by top-5 rankings, and so on), then slots subobs and q_ind can also be inferred correctly by the initializer. It is therefore highly recommended that you organize the ranking matrix in this way and use the initializer.
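The relationship between the three slots can be sketched in base R on a toy data set. This is an illustration under stated assumptions, not the initializer's actual code: the data are hypothetical, and q_ind is read here as the starting row of each chunk followed by ndistinct+1, so that chunk k spans rows q_ind[k] to q_ind[k+1]-1.

```r
# Hypothetical toy data: 5 objects, two top-2 rankings followed by one complete ranking.
nobj <- 5
ranking <- rbind(
  c(1, 2, 3, 3, 3),  # top-2: unobserved objects are coded q + 1 = 3
  c(3, 1, 3, 2, 3),  # top-2
  c(2, 1, 4, 3, 5)   # complete ranking, "q" level nobj - 1 = 4
)
count <- c(4, 6, 5)

# With this coding, the "q" level of each row is its largest entry minus one.
q_row <- apply(ranking, 1, max) - 1

# Distinct "q" levels in order of appearance (increasing, by construction).
topq <- unique(q_row)                                  # 2 4
# First row of each chunk, followed by ndistinct + 1.
q_ind <- c(match(topq, q_row), nrow(ranking) + 1)      # 1 3 4
# Total number of observations in each chunk.
subobs <- as.vector(tapply(count, factor(q_row, levels = topq), sum))  # 10 5
```

Because the chunks are contiguous and sorted by "q" level, a single pass over the rows is enough to recover all three quantities, which is presumably why this layout is the recommended one.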
Slots
nobj
The number of ranked objects. If not provided, it is inferred as the maximum ranking in the data set. Because unobserved objects in a top-q ranking all have ranking q+1, that maximum can be smaller than the true number of objects, so nobj must be provided for top-q ranking data.
nobs
The number of observations. It does not need to be provided during initialization, since it must equal the sum of slot count.
ndistinct
The number of distinct rankings. It does not need to be provided during initialization, since it must equal the number of rows of slot ranking.
ranking
A matrix that stores the ranking representation of the distinct rankings. Each row contains one ranking. For a top-q ranking, all unobserved objects have ranking q+1.
count
The number of observations for each distinct ranking, one entry per row of ranking.
topq
A numeric vector that stores the top-q ranking information. See the Details section for more information.
subobs
A numeric vector that stores the number of observations in each chunk of top-q rankings.
q_ind
A numeric vector that stores the beginning and ending of each chunk of top-q rankings. The last element has to be ndistinct+1.
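As an illustration of how ranking, count, ndistinct, and nobs fit together, the following base-R sketch collapses a hypothetical matrix of raw observations (one row per observation) into distinct rankings and their counts. The data and helper variable names are assumptions made for this example; rankdist itself is not used.

```r
# Hypothetical raw data: 5 observations of complete rankings of 3 objects.
raw <- rbind(
  c(1, 2, 3),
  c(2, 1, 3),
  c(1, 2, 3),
  c(1, 2, 3),
  c(2, 1, 3)
)

# Build a text key per row, keep the first occurrence of each distinct ranking,
# and count how often each key appears (in order of first appearance).
key     <- apply(raw, 1, paste, collapse = "-")
ranking <- raw[!duplicated(key), , drop = FALSE]
count   <- as.vector(table(factor(key, levels = unique(key))))

ndistinct <- nrow(ranking)  # number of distinct rankings
nobs      <- sum(count)     # total number of observations, equal to nrow(raw)
```

The resulting ranking and count objects have the shape the corresponding slots expect: one row per distinct ranking and one count per row.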
References
Qian Z, Yu PLH (2019). "Weighted Distance-Based Models for Ranking Data Using the R Package rankdist." Journal of Statistical Software, 90(5), 1-31. doi:10.18637/jss.v090.i05
See Also
Examples
library(rankdist)
# Create a data set with only complete rankings:
# 10 distinct rankings of 52 objects, one ranking per row
rankmat <- t(replicate(10, sample(1:52, 52)))
countvec <- sample(1:52, 10, replace = TRUE)  # one count per row of rankmat
rankdat <- new("RankData", ranking = rankmat, count = countvec)
# Create a data set with both top-10 and complete rankings
rankmat_in <- t(replicate(10, sample(1:52, 52)))
rankmat_in[rankmat_in > 11] <- 11  # unobserved objects get ranking q + 1 = 11
# Stack the top-10 chunk above the complete chunk (increasing "q" levels)
rankmat_total <- rbind(rankmat_in, rankmat)
countvec_total <- c(countvec, countvec)
rankdat2 <- new("RankData", ranking = rankmat_total, count = countvec_total, nobj = 52, topq = c(10, 51))