| data_censoring {MSmix} | R Documentation |
Censoring of full rankings
Description
Convert full rankings into either top-k or MAR (missing at random) partial rankings.
Usage
data_censoring(
rankings,
type = "topk",
nranked = NULL,
probs = rep(1, ncol(rankings) - 1)
)
Arguments
rankings |
Integer |
type |
Character indicating which censoring process must be used. Options are: "topk" and "mar". |
nranked |
Integer vector of length |
probs |
Numeric vector of the |
Details
Both forms of partial rankings can be obtained in two ways: (i) by specifying, in the nranked argument, the number of positions to be retained in each partial ranking; (ii) by setting nranked = NULL (default) and specifying, in the probs argument, the probabilities of retaining 1, 2, ..., (n-1) positions, respectively, in the partial rankings (recall that a partial sequence with (n-1) observed entries corresponds to a full ranking, since the single missing entry is determined by the others).
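As a rough illustration of option (ii), the number of retained positions for each ranking can be thought of as a draw from 1, ..., (n-1) with weights given by probs. The helper below, draw_nranked, is hypothetical and not part of MSmix; it assumes the weights do not need to sum to one, which is how base R's sample.int() treats its prob argument and is consistent with the unnormalized probs = 1:(n-1) call in the Examples.

```r
# Sketch only: `draw_nranked` is a hypothetical helper, not an MSmix function.
# It draws, for each of N rankings, how many positions to retain.
draw_nranked <- function(N, n, probs = rep(1, n - 1)) {
  stopifnot(length(probs) == n - 1, all(probs >= 0), sum(probs) > 0)
  # sample.int() accepts unnormalized weights in `prob`.
  sample.int(n - 1, size = N, replace = TRUE, prob = probs)
}

set.seed(12345)
k <- draw_nranked(N = 1000, n = 7, probs = 1:6)

# Empirical frequencies should be close to the normalized weights.
round(prop.table(table(k)), 2)
round(prop.table(1:6), 2)
```

Whether data_censoring draws the counts in exactly this way is an assumption; the sketch only mirrors the behaviour described above.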
In the censoring of full rankings into MAR partial sequences, the positions to be retained are drawn uniformly at random.
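The difference between the two censoring schemes can be sketched in base R for a single ranking. The function censor_one below is a hypothetical illustration, not the package's implementation. It assumes that r[j] holds the rank assigned to item j, that top-k censoring keeps the entries with rank at most k, and that MAR censoring keeps k entries chosen uniformly at random.

```r
# Sketch only: `censor_one` is a hypothetical helper, not an MSmix function.
# `r` is an integer vector where r[j] is the rank assigned to item j.
censor_one <- function(r, k, type = c("topk", "mar")) {
  type <- match.arg(type)
  stopifnot(k >= 1, k <= length(r))
  out <- r
  if (type == "topk") {
    # Keep only the items placed in the top k positions.
    out[r > k] <- NA
  } else {
    # Keep k entries chosen uniformly at random and blank the rest.
    keep <- sample.int(length(r), size = k)
    out[setdiff(seq_along(r), keep)] <- NA
  }
  out
}

full <- c(3L, 1L, 5L, 2L, 4L)  # toy full ranking of 5 items (made-up data)

censor_one(full, k = 2, type = "topk")  # NA 1 NA 2 NA

set.seed(12345)
censor_one(full, k = 3, type = "mar")   # 3 observed entries, 2 NA
```

In the top-k case the retained entries are fully determined by the ranking, whereas in the MAR case they depend on the random seed, which is why the Examples call set.seed() before each MAR censoring.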
Value
A list of two named objects:
part_rankings
Integer N × n matrix with partial (censored) rankings in each row. Missing positions are coded as NA.
nranked
Integer vector of length N with the actual number of items ranked in each partial sequence after censoring.
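A quick sanity check on an object of this shape is to compare nranked with the number of non-missing entries in each row of part_rankings. The list below is a made-up toy object that only mimics the documented structure; it is not output from data_censoring, and whether the package output satisfies this identity in every case is an assumption.

```r
# Toy object mimicking the documented return structure (made-up values).
out <- list(
  part_rankings = rbind(c(1L, NA, 2L, NA),
                        c(NA, 3L, 1L, 2L)),
  nranked = c(2L, 3L)
)

# Count the observed (non-NA) positions in each partial ranking.
observed <- as.integer(rowSums(!is.na(out$part_rankings)))

# Compare with the reported number of ranked items.
identical(observed, out$nranked)
```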
Examples
## Example 1. Censoring the Antifragility dataset into partial top rankings
# Top-3 censoring (assigned number of top positions to be retained)
n <- 7
r_antifrag <- ranks_antifragility[, 1:n]
data_censoring(r_antifrag, type = "topk", nranked = rep(3, nrow(r_antifrag)))
# Random top-k censoring with assigned probabilities
set.seed(12345)
data_censoring(r_antifrag, type = "topk", probs = 1:(n-1))
## Example 2. Simulate full rankings from a basic Mallows model with Spearman distance
n <- 10
N <- 100
set.seed(12345)
rankings <- rMSmix(sample_size = N, n_items = n)$samples
# MAR censoring with assigned number of positions to be retained
set.seed(12345)
nranked <- round(runif(N, 0.5, 1) * n)
set.seed(12345)
mar_ranks1 <- data_censoring(rankings, type = "mar", nranked = nranked)
mar_ranks1
identical(mar_ranks1$nranked, nranked)
# MAR censoring with assigned probabilities
set.seed(12345)
probs <- runif(n-1, 0, 0.5)
set.seed(12345)
mar_ranks2 <- data_censoring(rankings, type = "mar", probs = probs)
mar_ranks2
prop.table(table(mar_ranks2$nranked))
round(prop.table(probs), 2)