data_censoring {MSmix} | R Documentation
Censoring of full rankings
Description
Convert full rankings into either top-k or MAR (missing at random) partial rankings.
Usage
data_censoring(
rankings,
type = "topk",
nranked = NULL,
probs = rep(1, ncol(rankings) - 1)
)
Arguments
rankings |
Integer |
type |
Character indicating which censoring process must be used. Options are: "topk" (default) and "mar". |
nranked |
Integer vector of length |
probs |
Numeric vector of the |
Details
Both forms of partial rankings can be obtained in two ways: (i) by specifying, in the nranked argument, the number of positions to be retained in each partial ranking; (ii) by setting nranked = NULL (default) and specifying, in the probs argument, the probabilities of retaining respectively 1, 2, ..., (n-1) positions in the partial rankings (recall that a partial sequence with (n-1) observed entries corresponds to a full ranking).
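The two ways of fixing the number of retained positions can be illustrated with a minimal base-R sketch of top-k censoring. This is not the package's implementation: censor_topk_sketch is a hypothetical helper, and it assumes that each row of rankings stores the rank assigned to each item and that the weights in probs are rescaled to sum to 1 (the Examples pass unnormalized weights such as 1:(n-1)). Unlike a full treatment, the sketch does not fill in the single missing entry when (n-1) positions are kept.

```r
# Hypothetical helper, not part of MSmix: top-k censoring in base R.
censor_topk_sketch <- function(rankings, nranked = NULL,
                               probs = rep(1, ncol(rankings) - 1)) {
  N <- nrow(rankings)
  n <- ncol(rankings)
  if (is.null(nranked)) {
    # Way (ii): draw the number of retained positions from 1, ..., n - 1.
    # sample.int() rescales `prob`, so the weights need not sum to 1.
    nranked <- sample.int(n - 1, size = N, replace = TRUE, prob = probs)
  }
  out <- rankings
  # nranked is recycled down the columns, so entry [i, j] is compared
  # with nranked[i]; ranks beyond the top nranked[i] become NA.
  out[rankings > nranked] <- NA
  list(part_rankings = out, nranked = nranked)
}

set.seed(1)
toy <- t(replicate(5, sample(6)))                      # 5 full rankings of 6 items
fixed_k  <- censor_topk_sketch(toy, nranked = rep(3, 5))  # way (i)
random_k <- censor_topk_sketch(toy, probs = 1:5)          # way (ii)
```

In both calls, the number of non-missing entries in each row of part_rankings equals the corresponding element of nranked.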
When full rankings are censored into MAR partial sequences, the positions to be retained are drawn uniformly at random.
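A minimal base-R sketch of this uniform selection follows. It is an illustration under stated assumptions, not the package's code: censor_mar_sketch is a hypothetical helper, and it assumes that "uniformly" means every subset of the requested size is equally likely, i.e. sampling positions without replacement.

```r
# Hypothetical helper, not part of MSmix: MAR censoring in base R.
censor_mar_sketch <- function(rankings, nranked) {
  n <- ncol(rankings)
  out <- rankings
  for (i in seq_len(nrow(rankings))) {
    # Choose nranked[i] of the n positions uniformly, without replacement.
    keep <- sample.int(n, size = nranked[i])
    # Everything not selected is set to missing.
    out[i, setdiff(seq_len(n), keep)] <- NA
  }
  list(part_rankings = out, nranked = nranked)
}

set.seed(1)
toy <- t(replicate(4, sample(6)))          # 4 full rankings of 6 items
mar_toy <- censor_mar_sketch(toy, nranked = c(2, 3, 4, 6))
```

Retained entries keep their original rank values, so, unlike top-k censoring, the observed ranks in a row need not be the smallest ones.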
Value
A list of two named objects:
part_rankings
Integer N × n matrix with partial (censored) rankings in each row. Missing positions are coded as NA.
nranked
Integer vector of length N with the actual number of items ranked in each partial sequence after censoring.
Examples
## Example 1. Censoring the Antifragility dataset into partial top rankings
# Top-3 censoring (assigned number of top positions to be retained)
n <- 7
r_antifrag <- ranks_antifragility[, 1:n]
data_censoring(r_antifrag, type = "topk", nranked = rep(3, nrow(r_antifrag)))
# Random top-k censoring with assigned probabilities
set.seed(12345)
data_censoring(r_antifrag, type = "topk", probs = 1:(n-1))
## Example 2. Simulate full rankings from a basic Mallows model with Spearman distance
n <- 10
N <- 100
set.seed(12345)
rankings <- rMSmix(sample_size = N, n_items = n)$samples
# MAR censoring with assigned number of positions to be retained
set.seed(12345)
nranked <- round(runif(N, 0.5, 1) * n)
set.seed(12345)
mar_ranks1 <- data_censoring(rankings, type = "mar", nranked = nranked)
mar_ranks1
identical(mar_ranks1$nranked, nranked)
# MAR censoring with assigned probabilities
set.seed(12345)
probs <- runif(n-1, 0, 0.5)
set.seed(12345)
mar_ranks2 <- data_censoring(rankings, type = "mar", probs = probs)
mar_ranks2
prop.table(table(mar_ranks2$nranked))
round(prop.table(probs), 2)