calculate.maxK {TopKLists} | R Documentation |
The main function for TopKInference
Description
Returns a complex object named truncated.lists containing the Idata vector (see prepare.idata), the estimated truncation index j_0=k+1 (see compute.stream) for each pair of input lists, the overall top-k estimate (see j0.multi), and other objects with necessary plotting information for the aggmap.
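The pairwise structure described above can be sketched in base R. This is only an illustration, not the package's internal code: the list names and the per-pair estimates below are made up, and taking the maximum mirrors the documented meaning of the maxK value rather than the actual computation in j0.multi.

```r
# Hypothetical list names; no package data are used here.
list_names <- c("listA", "listB", "listC")

# All unordered pairs of lists: choose(3, 2) = 3 comparisons.
pairs <- combn(list_names, 2)
n_pairs <- ncol(pairs)

# Made-up per-pair top-k estimates, for illustration only.
k_hat <- c(12, 9, 15)
names(k_hat) <- apply(pairs, 2, paste, collapse = "_vs_")

# The largest pairwise estimate, analogous to the documented maxK value.
max_k <- max(k_hat)
```

With L input lists there are choose(L, 2) pairwise comparisons, so the number of estimates grows quadratically with the number of lists.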
Usage
calculate.maxK(lists, L, d, v, threshold)
Arguments
lists |
Data frame containing two or more columns that represent input lists of ordered objects subject to comparison |
L |
Number of input lists that are compared |
d |
The maximal distance delta between object ranks required for the estimation of |
v |
The pilot sample size (tuning parameter) |
threshold |
The percentage of occurrences of an object in the top-k selection, among all comparisons, required for the object to be gray-shaded in the |
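As an illustration of the input shape that the lists and L arguments describe, here is a minimal base R sketch. It assumes that each column holds one ranking of the same objects, with row i giving the object at rank i; the object and column names are hypothetical.

```r
# Hypothetical objects ranked by three assessors (one column per list).
set.seed(1)
objects <- paste0("gene", 1:10)
lists <- data.frame(
  listA = objects,          # row i holds the object at rank i
  listB = sample(objects),  # a random reordering of the same objects
  listC = sample(objects),
  stringsAsFactors = FALSE
)

# Number of input lists, here 3.
L <- ncol(lists)
```

Deriving L from ncol(lists) keeps the two arguments consistent when columns are added or removed.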
Value
A named list of the following content:
comparedLists |
Contains information about the overlap of all pairwise compared lists (structure for the |
info |
Contains information about the list names |
grayshadedLists |
Contains information about which objects in a list are consolidated (gray-shaded in the |
summarytable |
Table of top-k list overlaps containing rank information, the rank sum, the order of objects as a function of the rank sum, the frequency of an object in the input lists and the frequency of an object in the truncated lists (for plotting in the |
vennlists |
Contains the top-k objects for each of the input lists (for display in the Venn-diagram) |
venntable |
Contains the overlap information (for display in the Venn-table) |
v |
Selected pilot sample size (tuning parameter) |
Ntoplot |
Number of columns to be plotted in the |
Idata |
Data frame of Idata vectors (see |
d |
selected delta |
threshold |
selected threshold |
L |
number of lists |
N |
number of items in data frame (lists) |
lists |
data frame of lists that entered the analysis |
maxK |
maximal estimate of the top-k's (for all pairwise comparisons) |
topkspace |
the final integrated list of objects as result of the CEMC algorithm applied to the maxK truncated lists |
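The role of the threshold value can be sketched as follows. This is a rough illustration under stated assumptions, not the package's implementation: the top-k selections are invented, and flagging an object when its percentage reaches the threshold (rather than strictly exceeds it) is an assumption.

```r
# Hypothetical top-k selections from three pairwise comparisons
# (not output of calculate.maxK).
topk <- list(
  A_vs_B = c("g1", "g2", "g3"),
  A_vs_C = c("g1", "g3", "g4"),
  B_vs_C = c("g1", "g2", "g5")
)
threshold <- 50  # percent

# Percentage of comparisons in which each object appears.
counts <- table(unlist(topk))
pct <- 100 * as.vector(counts) / length(topk)
names(pct) <- names(counts)

# Assumption: an object is flagged when its percentage reaches the threshold.
flagged <- names(pct)[pct >= threshold]
```

Here g1 appears in all three selections, g2 and g3 in two of three, so these three objects pass a 50 percent threshold while g4 and g5 do not.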
Author(s)
Eva Budinska <budinska@iba.muni.cz>, Michael G. Schimek <michael.schimek@medunigraz.at>
References
Hall, P. and Schimek, M. G. (2012). Moderate deviation-based inference for random degeneration in paired rank lists. J. Amer. Statist. Assoc., 107, 661-672.
See Also
Examples
set.seed(1234)
data(breast)
truncated.lists = calculate.maxK(breast, d=6, v=10, L=3, threshold=50)
## Not run:
aggmap(truncated.lists)
## End(Not run)