sset {cheapr}

R Documentation

Cheaper subset

Description

Cheaper alternative to [ that consistently subsets data frame rows, always returning a data frame. There are explicit methods for enhanced data frames like tibbles, data.tables and sf.

Usage

sset(x, ...)

## S3 method for class 'Date'
sset(x, i, ...)

## S3 method for class 'POSIXct'
sset(x, i, ...)

## S3 method for class 'factor'
sset(x, i, ...)

## S3 method for class 'data.frame'
sset(x, i, j, ...)

## S3 method for class 'tbl_df'
sset(x, i, j, ...)

## S3 method for class 'POSIXlt'
sset(x, i, j, ...)

## S3 method for class 'data.table'
sset(x, i, j, ...)

## S3 method for class 'sf'
sset(x, i, j, ...)

Arguments

`x`	Vector or data frame.
`...`	Further parameters passed to `[`.
`i`	A logical vector or a vector of indices.
`j`	Column indices, names or logical vector.

Details

sset is an S3 generic. You can either write methods for sset or [.
sset will fall back on using [ when no suitable method is found.
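
As a minimal sketch of the first option, the block below defines an sset method for a made-up class. The class name celsius and the constructor new_celsius() are hypothetical and not part of cheapr; the method only subsets the underlying vector and restores the class.

```r
library(cheapr)

# Hypothetical class used only for illustration; not part of cheapr
new_celsius <- function(x) structure(x, class = "celsius")

# sset method for the hypothetical class:
# subset the bare vector, then restore the class attribute
sset.celsius <- function(x, i, ...) {
  new_celsius(sset(unclass(x), i, ...))
}

temps <- new_celsius(c(20.5, 22, 19))
out <- sset(temps, 2:3)
class(out)
```

Inside a package, a method like this would also need to be registered in NAMESPACE with S3method(sset, celsius).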

In more detail, when sset() is used on a data frame, a new list is always allocated through new_list().

Difference to base R

When i is a logical vector, it is passed directly to which_().
This means that NA values are ignored and that i is not recycled, so it is good practice to make sure the logical vector has the same length as x. To return NA values, use sset(x, NA_integer_).
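
The NA handling described above can be illustrated with a short sketch. The variable names are made up for the example, and base which() is used here only as a stand-in to show the equivalent base R idiom; it is not claimed to be how which_() is implemented.

```r
library(cheapr)

x <- c(10, 20, 30)
i <- c(TRUE, NA, FALSE)

# Base R keeps an NA placeholder for the NA index
base_result <- x[i]

# sset() is documented to ignore the NA in a logical index
sset_result <- sset(x, i)

# Comparable base R idiom: convert the logical vector to positions first
which_result <- x[which(i)]

identical(sset_result, which_result)

# Requesting NA explicitly, as suggested in the text
na_result <- sset(x, NA_integer_)
```

Note that i has the same length as x here, which avoids relying on recycling.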

ALTREP range subsetting

When i is an ALTREP compact sequence, which is commonly created using e.g. 1:10, seq_len, seq_along or seq.int, sset internally uses a range-based subsetting method that is faster and does not allocate i in memory.
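
A small sketch of the two kinds of index follows. It assumes only what the paragraph above states: the results should be the same either way, and the difference lies in how i is stored and processed internally. The variable names are illustrative.

```r
library(cheapr)

x <- letters

# An index created with `:` (or seq_len(), seq_along(), seq.int())
# is the kind of compact sequence the range-based method targets
compact_i <- 1:5
first_five <- sset(x, compact_i)

# An index built with c() is an ordinary integer vector held in memory
plain_i <- c(1L, 2L, 3L, 4L, 5L)
same_five <- sset(x, plain_i)

# Both forms select the same elements
identical(first_five, same_five)
```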

Value

A new vector, data frame, list, matrix or other R object.

Examples

library(cheapr)
library(bench)

# Selecting columns
sset(airquality, j = "Temp")
sset(airquality, j = 1:2)

# Selecting rows
sset(iris, 1:5)

# Rows and columns
sset(iris, 1:5, 1:5)
sset(iris, iris$Sepal.Length > 7, c("Species", "Sepal.Length"))

# Comparison against base
x <- rnorm(10^4)

mark(x[1:10^3], sset(x, 1:10^3))
mark(x[x > 0], sset(x, x > 0))

df <- data.frame(x = x)

mark(df[df$x > 0, , drop = FALSE],
     sset(df, df$x > 0),
     check = FALSE) # Row names are different


## EXTRA: An easy way to incorporate cheapr into dplyr's filter()
# cheapr_filter <- function(.data, ..., .by = NULL, .preserve = FALSE){
#   filter_df <- .data |>
#     dplyr::mutate(..., .by = {{ .by }}, .keep = "none")
#   groups <- dplyr::group_vars(filter_df)
#   filter_df <- cheapr::sset(filter_df, j = setdiff(names(filter_df), groups))
#   n_filters <- ncol(filter_df)
#   if (n_filters < 1){
#     .data
#   } else {
#     dplyr::dplyr_row_slice(.data, cheapr::which_(Reduce(`&`, filter_df)),
#                            preserve = .preserve)
#   }
# }

[Package cheapr version 0.9.3 Index]