sset {cheapr} | R Documentation |
Cheaper subset
Description
Cheaper alternative to [ that consistently subsets data frame rows, always returning a data frame. There are explicit methods for enhanced data frames like tibbles, data.tables and sf.
Usage
sset(x, ...)
## S3 method for class 'Date'
sset(x, i, ...)
## S3 method for class 'POSIXct'
sset(x, i, ...)
## S3 method for class 'factor'
sset(x, i, ...)
## S3 method for class 'data.frame'
sset(x, i, j, ...)
## S3 method for class 'tbl_df'
sset(x, i, j, ...)
## S3 method for class 'POSIXlt'
sset(x, i, j, ...)
## S3 method for class 'data.table'
sset(x, i, j, ...)
## S3 method for class 'sf'
sset(x, i, j, ...)
Arguments
x | Vector or data frame. |
... | Further parameters passed to |
i | A logical vector or a vector of indices. |
j | Column indices, names or logical vector. |
Details
sset is an S3 generic. You can write methods for either sset or [. sset falls back on [ when no suitable method is found.
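As a minimal sketch of writing a method for sset, the block below defines one for a made-up class. The class "metres" and the helper new_metres() are hypothetical names introduced only for illustration; they are not part of cheapr. The call is guarded so the definitions can be sourced even where cheapr is not installed.

```r
# Hypothetical class "metres": a numeric vector tagged with a class attribute.
new_metres <- function(x) structure(as.numeric(x), class = "metres")

# Illustrative sset method: subset the underlying numbers with sset,
# then restore the class on the result.
sset.metres <- function(x, i, ...) {
  new_metres(cheapr::sset(as.numeric(x), i, ...))
}

if (requireNamespace("cheapr", quietly = TRUE)) {
  m <- new_metres(c(1.5, 2.5, 4))
  res <- cheapr::sset(m, 1:2)  # dispatches to sset.metres
  class(res)
}
```

A class that already has a [ method would not need this, since sset falls back on [ when no sset method exists.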
In more detail, when sset() is used on a data frame, a new list is always allocated through new_list().
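The data frame behaviour can be illustrated with a small comparison against base R. This is only a sketch: the toy data frame is made up, and the cheapr calls are guarded so the base R half runs on its own.

```r
df <- data.frame(a = 1:3, b = c("x", "y", "z"))

# Base R drops a single selected column to a plain vector by default
base_col <- df[, "a"]
is.data.frame(base_col)  # FALSE

if (requireNamespace("cheapr", quietly = TRUE)) {
  # sset is documented to always return a data frame
  cheapr_col <- cheapr::sset(df, j = "a")
  is.data.frame(cheapr_col)

  # Row subsetting likewise keeps the data frame structure
  cheapr_rows <- cheapr::sset(df, 2:3)
  dim(cheapr_rows)
}
```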
Difference to base R
When i is a logical vector, it is passed directly to which_(). This means that NA values are ignored and that i is not recycled, so it is good practice to make sure the logical vector matches the length of x. To return NA values, use sset(x, NA_integer_).
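A short sketch of the difference described above, using made-up vectors. The base R results are standard behaviour; the cheapr calls are guarded and shown without printed output, since they simply follow the description in this section.

```r
x <- c(10, 20, 30)
i <- c(TRUE, NA, TRUE)

# Base R keeps a slot for the NA index and fills it with NA
base_na <- x[i]            # 10 NA 30

# Base R also recycles a short logical index
base_recycled <- (1:6)[c(TRUE, FALSE)]  # 1 3 5

if (requireNamespace("cheapr", quietly = TRUE)) {
  # NA in i is ignored, so only the TRUE positions are returned
  cheapr_na <- cheapr::sset(x, i)

  # Explicit way to ask for an NA element
  cheapr_explicit_na <- cheapr::sset(x, NA_integer_)
}
```

Because i is not recycled by sset, the sketch deliberately uses a logical index with the same length as x.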
ALTREP range subsetting
When i is an ALTREP compact sequence, commonly created using e.g. 1:10, seq_len, seq_along or seq.int, sset internally uses a range-based subsetting method which is faster and doesn't allocate i into memory.
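As a small sketch, the two calls below pass the same positions once as a compact sequence and once as an ordinary integer vector. Which internal path each call takes is not observable from the result; the point is only that both forms should select the same elements. The vector x is made up for illustration.

```r
x <- seq(0, 1, by = 0.1)

if (requireNamespace("cheapr", quietly = TRUE)) {
  # 2:4 is a compact sequence, the case this section describes
  from_range <- cheapr::sset(x, 2:4)

  # c(2L, 3L, 4L) is an ordinary integer vector held in memory
  from_vector <- cheapr::sset(x, c(2L, 3L, 4L))

  identical(from_range, from_vector)
}
```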
Value
A new vector, data frame, list, matrix or other R object.
Examples
library(cheapr)
library(bench)
# Selecting columns
sset(airquality, j = "Temp")
sset(airquality, j = 1:2)
# Selecting rows
sset(iris, 1:5)
# Rows and columns
sset(iris, 1:5, 1:5)
sset(iris, iris$Sepal.Length > 7, c("Species", "Sepal.Length"))
# Comparison against base
x <- rnorm(10^4)
mark(x[1:10^3], sset(x, 1:10^3))
mark(x[x > 0], sset(x, x > 0))
df <- data.frame(x = x)
mark(df[df$x > 0, , drop = FALSE],
     sset(df, df$x > 0),
     check = FALSE) # Row names are different
## EXTRA: An easy way to incorporate cheapr into dplyr's filter()
# cheapr_filter <- function(.data, ..., .by = NULL, .preserve = FALSE){
#   filter_df <- .data |>
#     dplyr::mutate(..., .by = {{ .by }}, .keep = "none")
#   groups <- dplyr::group_vars(filter_df)
#   filter_df <- cheapr::sset(filter_df, j = setdiff(names(filter_df), groups))
#   n_filters <- ncol(filter_df)
#   if (n_filters < 1){
#     .data
#   } else {
#     dplyr::dplyr_row_slice(.data, cheapr::which_(Reduce(`&`, filter_df)),
#                            preserve = .preserve)
#   }
# }