data_augmentation {MSmix}    R Documentation

Data augmentation of partial rankings

Description

For a given partial ranking matrix, generate all full rankings that are compatible with each partially ranked sequence. Partial rankings with at most 10 missing positions and arbitrary censoring patterns are supported.
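To make "compatible full rankings" concrete, here is a minimal base R sketch. The helpers `all_perms` and `augment_one` are hypothetical names introduced only for illustration. They are not part of MSmix and are not meant to reproduce its implementation. The sketch assigns the unused ranks to the missing positions in every possible order.

```r
# Hypothetical helper (not part of MSmix): all permutations of a vector,
# returned as the rows of a matrix.
all_perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  out <- NULL
  for (i in seq_along(v)) {
    out <- rbind(out, cbind(v[i], all_perms(v[-i])))
  }
  out
}

# Hypothetical helper (not part of MSmix): every full ranking compatible
# with one partial ranking, where NA marks a missing position.
augment_one <- function(x) {
  miss <- which(is.na(x))
  if (length(miss) == 0) return(matrix(x, nrow = 1))
  free <- setdiff(seq_along(x), x)   # ranks not yet assigned
  fills <- all_perms(free)           # one row per possible assignment
  res <- matrix(x, nrow = nrow(fills), ncol = length(x), byrow = TRUE)
  res[, miss] <- fills               # plug each assignment into the gaps
  res
}

# Two missing positions, so two compatible full rankings.
augment_one(c(2, NA, 1, NA, 3))
```

Each row of the result is a permutation of 1, ..., n that agrees with the input on every observed position.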

Usage

data_augmentation(rankings, subset = NULL, fill_single_na = TRUE)

Arguments

rankings

Integer N × n matrix with partial rankings in each row. Missing positions must be coded as NA.

subset

Optional logical or integer vector specifying the subset of observations, i.e. rows of the rankings, to be kept. Missing values are taken as FALSE.
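The rule that missing values are taken as FALSE can be mimicked in base R as follows. This is only a sketch of the described behaviour for a logical `subset`, not the package's own code. The vector `keep` is a hypothetical name.

```r
rank_data <- rbind(c(NA, 4, NA, 1, NA),
                   c(NA, NA, NA, NA, 1),
                   c(2, NA, 1, NA, 3))

# Hypothetical logical selector with a missing entry.
keep <- c(TRUE, NA, FALSE)

# Treat NA as FALSE, then keep the selected rows as a matrix.
keep[is.na(keep)] <- FALSE
kept <- rank_data[keep, , drop = FALSE]
kept
```

Using `drop = FALSE` keeps the result a matrix even when a single row is selected.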

fill_single_na

Logical: whether rows of rankings with a single missing position must be filled in with the only remaining rank prior to data augmentation. Defaults to TRUE.
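A ranking with exactly one missing position has only one compatible completion, because only one rank is left unused. The following base R sketch illustrates the idea. `fill_single` is a hypothetical helper, not an MSmix function, and it is an assumption that the package performs the fill in a similar way.

```r
# Hypothetical helper (not part of MSmix): fill a lone NA with the only
# rank that has not been used yet; other rows are returned unchanged.
fill_single <- function(x) {
  miss <- which(is.na(x))
  if (length(miss) == 1) x[miss] <- setdiff(seq_along(x), x)
  x
}

fill_single(c(NA, 4, 1, 3, 2))   # 5 4 1 3 2
fill_single(c(2, NA, 1, NA, 3))  # unchanged: two positions are missing
```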

Details

The data augmentation of a full ranking returns the complete ranking itself, arranged as a row vector. The function can also be applied to partial observations expressed in ordering format. Before proceeding, a message informs the user when the augmentation may be computationally heavy.
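The cost of the augmentation can be estimated in advance: a row with k missing positions has k! compatible full rankings, so the 10-missing-position limit corresponds to 3628800 completions for a single row. The sketch below uses only base R. The names `n_missing` and `n_completions` are illustrative, and the threshold at which the package issues its message is not assumed here.

```r
rank_data <- rbind(c(NA, 4, NA, 1, NA),
                   c(NA, NA, NA, NA, 1),
                   c(2, NA, 1, NA, 3),
                   c(4, 2, 3, 5, 1),
                   c(NA, 4, 1, 3, 2))

# Number of missing positions in each row.
n_missing <- rowSums(is.na(rank_data))

# Number of compatible full rankings per row: k! for k missing positions.
n_completions <- factorial(n_missing)
n_completions    # 6 24 2 1 1

# Upper bound per row under the 10-missing-position limit.
factorial(10)    # 3628800
```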

Value

A list of N elements, where each element is the matrix of full rankings compatible with the corresponding partial sequence.

References

Crispino M, Mollica C, Astuti V and Tardella L (2023). Efficient and accurate inference for mixtures of Mallows models with Spearman distance. Statistics and Computing, 33(98), DOI: 10.1007/s11222-023-10266-8.

Examples


## Example 1. Data augmentation of a single partial top-9 ranking.
data_augmentation(c(3, 7, 5, 1, NA, 4, NA, 8, 2, 6, NA, 9))

## Example 2. Data augmentation of partial rankings with different censoring patterns.
rank_data <- rbind(c(NA, 4, NA, 1, NA),
                   c(NA, NA, NA, NA, 1),
                   c(2, NA, 1, NA, 3),
                   c(4, 2, 3, 5, 1),
                   c(NA, 4, 1, 3, 2))
data_augmentation(rank_data)




[Package MSmix version 1.0.1 Index]