dbplyr-slice {dbplyr} | R Documentation |
Subset rows using their positions
Description
These are methods for the dplyr generics slice_min(), slice_max(), and slice_sample(). They are translated to SQL using filter() and window functions (ROW_NUMBER, MIN_RANK, or CUME_DIST, depending on arguments). slice(), slice_head(), and slice_tail() are not supported since database tables have no intrinsic order.
If data is grouped, the operation will be performed on each group so that (e.g.) slice_min(db, x, n = 3) will select the three rows with the smallest value of x in each group.
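As a rough illustration of this translation, the sketch below renders the SQL for each case against a simulated connection, so no database is required. It assumes lazy_frame() and sql_render() from dbplyr; the table lf and its columns are made up for the example, and the exact SQL text varies by backend and dbplyr version.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr)

# Hypothetical lazy table on a simulated connection; nothing is executed.
lf <- lazy_frame(x = 1:3, y = c(1, 1, 2))

# Ties kept (the default): a rank-based window function is used.
sql_ties <- as.character(sql_render(slice_min(lf, x, n = 2)))

# Ties dropped: a row-number window function is used.
sql_no_ties <- as.character(
  sql_render(slice_min(lf, x, n = 2, with_ties = FALSE))
)

# Proportion of rows: a cumulative-distribution window function is used.
sql_prop <- as.character(sql_render(slice_min(lf, x, prop = 0.5)))

# Grouped data: the window is partitioned by the grouping variable.
sql_grouped <- as.character(
  sql_render(slice_min(group_by(lf, y), x, n = 2))
)

cat(sql_ties, sql_no_ties, sql_prop, sql_grouped, sep = "\n\n")
```

Comparing the four queries side by side is a quick way to check which window function a given combination of n, prop, and with_ties selects on your backend.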
Usage
## S3 method for class 'tbl_lazy'
slice_min(
  .data,
  order_by,
  ...,
  n,
  prop,
  by = NULL,
  with_ties = TRUE,
  na_rm = TRUE
)
## S3 method for class 'tbl_lazy'
slice_max(
  .data,
  order_by,
  ...,
  n,
  by = NULL,
  prop,
  with_ties = TRUE,
  na_rm = TRUE
)
## S3 method for class 'tbl_lazy'
slice_sample(.data, ..., n, prop, by = NULL, weight_by = NULL, replace = FALSE)
Arguments
.data |
A lazy data frame backed by a database query. |
order_by |
Variable or function of variables to order by. |
... |
Not used. |
n , prop |
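Provide either n, the number of rows, or prop, the proportion of rows to select. If neither is supplied, n = 1 will be used. |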
by |
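<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). |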
with_ties |
Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties and return the first n rows. |
na_rm |
Should missing values in order_by be removed from the result? |
weight_by , replace |
Not supported for database backends. |
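The by argument and group_by() can be compared with a short sketch. This is an illustration only: it assumes lazy_frame() and sql_render() from dbplyr, a dbplyr version whose slice methods accept by (as in the Usage above), and a made-up table lf.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr)

# Hypothetical lazy table on a simulated connection; nothing is executed.
lf <- lazy_frame(x = 1:3, y = c(1, 1, 2))

# Per-operation grouping via the by argument.
by_arg <- slice_min(lf, x, n = 1, by = y)

# Persistent grouping via group_by().
grouped <- slice_min(group_by(lf, y), x, n = 1)

# Both queries should partition the window by y.
sql_by <- as.character(sql_render(by_arg))
sql_grouped <- as.character(sql_render(grouped))
cat(sql_by, sql_grouped, sep = "\n\n")

# With group_by(), the result stays grouped by y.
group_vars(grouped)
```

Use by when the grouping is needed for a single slice only, and group_by() when later verbs should also operate per group.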
Examples
library(dplyr, warn.conflicts = FALSE)
library(dbplyr) # provides memdb_frame()
db <- memdb_frame(x = 1:3, y = c(1, 1, 2))
db %>% slice_min(x) %>% show_query()
db %>% slice_max(x) %>% show_query()
db %>% slice_sample() %>% show_query()
db %>% group_by(y) %>% slice_min(x) %>% show_query()
# By default, ties are included, so you may get more rows
# than you expect
db %>% slice_min(y, n = 1)
db %>% slice_min(y, n = 1, with_ties = FALSE)
# Non-integer group sizes are rounded down
db %>% slice_min(x, prop = 0.5)