ranking {dplyr} | R Documentation |
Six variations on ranking functions, mimicking the ranking functions
described in SQL2003. They are currently implemented using the built-in
rank function, and are provided mainly as a convenience when
converting between R and SQL. All ranking functions map the smallest
inputs to the smallest outputs. Use desc() to reverse the direction.
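As a minimal sketch of reversing the direction, the example below ranks a small vector in ascending and then descending order. The vector scores is hypothetical data made up for illustration.

```r
library(dplyr)

scores <- c(10, 30, 20, 30)  # hypothetical data, not from the package

# Default direction: the smallest value receives rank 1
ascending <- min_rank(scores)

# Wrapping the input in desc() gives rank 1 to the largest value instead
descending <- min_rank(desc(scores))

ascending   # 1 3 2 3
descending  # 4 1 3 1
```

The same wrapping should work with the other ranking functions, since desc() only transforms the input vector before it is ranked.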
row_number(x)
ntile(x = row_number(), n)
min_rank(x)
dense_rank(x)
percent_rank(x)
cume_dist(x)
x | a vector of values to rank. Missing values are left as is. If you want to treat them as the smallest or largest values, replace them with -Inf or Inf, respectively, before ranking. |
n | number of groups to split up into. |
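The handling of missing values described for x can be sketched as follows. The vector vals is hypothetical, and base R's replace() is just one of several ways to substitute the missing entries.

```r
library(dplyr)

vals <- c(5, 1, NA, 3)  # hypothetical data with one missing value

# By default the missing value is left as is
kept_na <- min_rank(vals)

# Replace NA with -Inf so it is ranked as the smallest value
na_smallest <- min_rank(replace(vals, is.na(vals), -Inf))

# Replace NA with Inf so it is ranked as the largest value
na_largest <- min_rank(replace(vals, is.na(vals), Inf))

kept_na      # 3 1 NA 2
na_smallest  # 4 2 1 3
na_largest   # 3 1 4 2
```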
row_number(): equivalent to rank(ties.method = "first")
min_rank(): equivalent to rank(ties.method = "min")
dense_rank(): like min_rank(), but with no gaps between ranks
percent_rank(): a number between 0 and 1 computed by rescaling min_rank() to [0, 1]
cume_dist(): a cumulative distribution function. The proportion of all values less than or equal to the current value.
ntile(): a rough rank, which breaks the input vector into n buckets. The sizes of the buckets may differ by up to one; larger buckets have lower rank.
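The descriptions above can be checked against base R with a short sketch. The base-R expressions for dense_rank(), percent_rank() and cume_dist() are one reading of those descriptions, not necessarily how the package computes them. In particular, using the number of non-missing values as the denominator is an assumption, and n_obs and the base_* names are illustrative.

```r
library(dplyr)

x <- c(5, 1, 3, 2, 2, NA)
n_obs <- sum(!is.na(x))  # number of non-missing values (assumed denominator)

# Base-R counterparts; na.last = "keep" leaves NA in place
base_first <- rank(x, ties.method = "first", na.last = "keep")
base_min   <- rank(x, ties.method = "min",   na.last = "keep")
base_max   <- rank(x, ties.method = "max",   na.last = "keep")

all.equal(as.numeric(row_number(x)), as.numeric(base_first))
all.equal(as.numeric(min_rank(x)),   as.numeric(base_min))

# dense_rank(): position of each value among the sorted distinct values
all.equal(as.numeric(dense_rank(x)), as.numeric(match(x, sort(unique(x)))))

# percent_rank(): min_rank() shifted to start at 0, then divided by n_obs - 1
all.equal(percent_rank(x), (base_min - 1) / (n_obs - 1))

# cume_dist(): share of non-missing values less than or equal to each value
all.equal(cume_dist(x), base_max / n_obs)

# ntile(): tabulate the bucket sizes for eight values split into three buckets
bucket_sizes <- table(ntile(1:8, 3))
bucket_sizes
```

Each all.equal() call is expected to return TRUE for this input, and the table should show buckets of size 3, 3 and 2, with the larger buckets at the lower ranks.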
library(dplyr)
x <- c(5, 1, 3, 2, 2, NA)
row_number(x)
min_rank(x)
dense_rank(x)
percent_rank(x)
cume_dist(x)
ntile(x, 2)
ntile(1:8, 3)
# row_number can be used with single table verbs without specifying x
# (for data frames and databases that support windowing)
mutate(mtcars, row_number() == 1L)
mtcars %>% filter(between(row_number(), 1, 10))
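As a further sketch of calling row_number() without x inside a verb, the example below combines it with group_by(), so that the numbering restarts within each group. Grouping is not covered in the text above, and the name first_per_cyl is illustrative.

```r
library(dplyr)

# Keep the first row of each cylinder group in mtcars;
# row_number() is evaluated separately within each group
first_per_cyl <- mtcars %>%
  group_by(cyl) %>%
  filter(row_number() == 1L) %>%
  ungroup()

first_per_cyl
```

Since mtcars has three distinct values of cyl (4, 6 and 8), the result should contain three rows.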