data_censoring {MSmix} | R Documentation |
Censoring of full rankings
Description
Convert full rankings into either top-k or MAR (missing at random) partial rankings.
Usage
data_censoring(
rankings,
type = "topk",
nranked = NULL,
probs = rep(1, ncol(rankings) - 1)
)
Arguments
rankings |
Integer N×n matrix with full rankings in each row. |
type |
Character indicating which censoring process must be used. Options are: "topk" and "mar". Defaults to "topk". |
nranked |
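Integer vector of length N with the number of positions to be retained in each partial ranking. If NULL (default), these numbers are randomly generated according to probs. |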
probs |
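Numeric vector of the n - 1 probabilities of retaining 1, 2, ..., n - 1 positions in each partial ranking. Used only when nranked = NULL. Defaults to equal probabilities. |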
Details
Both forms of partial rankings can be obtained in two ways: (i) by specifying, in the nranked
argument, the number of positions to be retained in each partial ranking; (ii) by setting nranked = NULL
(default) and specifying, in the probs
argument, the probabilities of retaining respectively 1, 2, ..., n - 1 positions in the partial rankings (recall that a partial sequence with
n - 1 observed entries corresponds to a full ranking).
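Option (ii) can be pictured with a short base R sketch. This is only an illustration of drawing the number of retained positions from a vector of weights, not the MSmix source code; the object names (nranked_sim, the weights in probs) are made up for the example.

```r
# Hypothetical illustration in base R; not the MSmix implementation.
set.seed(1)
n <- 5                   # number of items
N <- 1000                # number of rankings to censor
probs <- c(4, 3, 2, 1)   # unnormalized weights for retaining 1, ..., n - 1 positions

# sample() rescales `prob` internally, so the weights need not sum to 1
nranked_sim <- sample(seq_len(n - 1), size = N, replace = TRUE, prob = probs)

# Empirical frequencies should be close to the normalized weights
round(prop.table(table(nranked_sim)), 2)
round(prop.table(probs), 2)
```

The last two lines mirror the check at the end of Example 2, where the observed distribution of nranked is compared with the normalized probs.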
When full rankings are censored into MAR partial sequences, the positions to be retained are drawn uniformly at random.
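The difference between the two censoring types can be sketched for a single ranking as follows. This is a minimal base R illustration under stated assumptions: censor_one is a hypothetical helper (not an MSmix function), and each entry of the vector is assumed to hold the rank assigned to the corresponding item.

```r
# Hypothetical helper; a base R sketch, not the MSmix implementation.
censor_one <- function(ranking, k, type = c("topk", "mar")) {
  type <- match.arg(type)
  n <- length(ranking)
  keep <- if (type == "topk") {
    seq_len(k)                # retain the top positions 1, ..., k
  } else {
    sample.int(n, size = k)   # retain k positions drawn uniformly, without replacement
  }
  ranking[!(ranking %in% keep)] <- NA   # censored positions become NA
  ranking
}

full <- c(3L, 1L, 5L, 2L, 4L)   # rank assigned to each of 5 items

censor_one(full, k = 2, type = "topk")
# NA  1 NA  2 NA

set.seed(1)
censor_one(full, k = 3, type = "mar")   # 3 randomly chosen positions survive
```

With type = "topk" the retained positions are always the first k, whereas with type = "mar" any subset of k positions is equally likely to be retained.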
Value
A list of two named objects:
part_rankings
Integer N×n matrix with partial (censored) rankings in each row. Missing positions are coded as NA.
nranked
Integer vector of length N with the actual number of items ranked in each partial sequence after censoring.
Examples
## Example 1. Censoring the Antifragility dataset into partial top rankings
# Top-3 censoring (assigned number of top positions to be retained)
n <- 7
r_antifrag <- ranks_antifragility[, 1:n]
data_censoring(r_antifrag, type = "topk", nranked = rep(3, nrow(r_antifrag)))
# Random top-k censoring with assigned probabilities
set.seed(12345)
data_censoring(r_antifrag, type = "topk", probs = 1:(n-1))
## Example 2. Simulate full rankings from a basic Mallows model with Spearman distance
n <- 10
N <- 100
set.seed(12345)
rankings <- rMSmix(sample_size = N, n_items = n)$samples
# MAR censoring with assigned number of positions to be retained
set.seed(12345)
nranked <- round(runif(N, 0.5, 1) * n)
set.seed(12345)
mar_ranks1 <- data_censoring(rankings, type = "mar", nranked = nranked)
mar_ranks1
identical(mar_ranks1$nranked, nranked)
# MAR censoring with assigned probabilities
set.seed(12345)
probs <- runif(n-1, 0, 0.5)
set.seed(12345)
mar_ranks2 <- data_censoring(rankings, type = "mar", probs = probs)
mar_ranks2
prop.table(table(mar_ranks2$nranked))
round(prop.table(probs), 2)