
nest_filter {nplyr}

R Documentation

Subset rows in nested data frames using column values.

Description

nest_filter() is used to subset nested data frames, retaining all rows that satisfy your conditions. To be retained, the row must produce a value of TRUE for all conditions. Note that when a condition evaluates to NA the row will be dropped, unlike base subsetting with [.

nest_filter() subsets the rows within .nest_data, applying the expressions in ... to the column values to determine which rows should be retained. It can be applied to both grouped and ungrouped data.
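As a minimal sketch of the NA behavior described above, the following compares base subsetting with [ against nest_filter(). The toy data (`scores`, `team`, `score`, `team_data`) are hypothetical names invented for illustration; the example assumes the dplyr, tidyr, and nplyr packages are installed.

```r
library(dplyr)
library(tidyr)
library(nplyr)

# Hypothetical toy data with missing values in `score`
scores <- tibble(
  team  = c("a", "a", "a", "b", "b"),
  score = c(10, NA, 3, 7, NA)
)

# Base subsetting on a plain data frame: an NA condition yields a row of NAs
scores_df <- as.data.frame(scores)
base_rows <- scores_df[scores_df$score > 5, ]
nrow(base_rows)  # 4: two matching rows plus two all-NA rows

# nest_filter(): rows whose condition evaluates to NA are dropped
scores_nest <- scores %>% nest(team_data = -team)
filtered <- scores_nest %>% nest_filter(team_data, score > 5)

# Row counts of each nested data frame after filtering
sapply(filtered$team_data, nrow)
```

Only the rows where the condition is TRUE should remain in each nested data frame, so neither team keeps its NA row.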

Usage

nest_filter(.data, .nest_data, ..., .preserve = FALSE)

Arguments

`.data`	A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr).
`.nest_data`	A list-column containing data frames.
`...`	Expressions that return a logical value, and are defined in terms of the variables in `.nest_data`. If multiple expressions are included, they are combined with the `&` operator. Only rows for which all conditions evaluate to `TRUE` are kept.
`.preserve`	Relevant when `.nest_data` is grouped. If `.preserve = FALSE` (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.
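To illustrate how multiple expressions in `...` are combined, the sketch below passes two conditions separately and then as a single `&` expression; the two calls should give the same result. The data (`sales`, `region`, `units`, `price`, `region_data`) are hypothetical and exist only for this illustration.

```r
library(dplyr)
library(tidyr)
library(nplyr)

# Hypothetical toy data
sales <- tibble(
  region = c("n", "n", "s", "s"),
  units  = c(5, 12, 8, 20),
  price  = c(2, 9, 3, 1)
)

sales_nest <- sales %>% nest(region_data = -region)

# Two expressions supplied separately ...
separate <- sales_nest %>%
  nest_filter(region_data, units > 6, price < 5)

# ... and the same conditions written as one `&` expression
combined <- sales_nest %>%
  nest_filter(region_data, units > 6 & price < 5)

# Compare the nested data frames produced by both calls
all.equal(separate$region_data, combined$region_data)
```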

Details

nest_filter() is largely a wrapper for dplyr::filter() and maintains the functionality of filter() within each nested data frame. For more information on filter(), please refer to the documentation in dplyr.
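The wrapper relationship can be sketched by applying dplyr::filter() to each nested data frame by hand. This is an illustration of the idea under stated assumptions, not nplyr's actual implementation; it assumes the purrr package is installed, and the data and the `by_hand` name are hypothetical.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(nplyr)

# Hypothetical toy data
scores <- tibble(
  team  = c("a", "a", "b", "b"),
  score = c(10, 3, 7, 1)
)

scores_nest <- scores %>% nest(team_data = -team)

# Using nest_filter()
via_nplyr <- scores_nest %>%
  nest_filter(team_data, score > 5)

# Roughly the same operation written out manually:
# map dplyr::filter() over each data frame in the list-column
by_hand <- scores_nest %>%
  mutate(team_data = map(team_data, ~ filter(.x, score > 5)))

# Compare the nested data frames produced by both approaches
all.equal(via_nplyr$team_data, by_hand$team_data)
```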

Value

An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:

Rows are a subset of the input, but appear in the same order.
Columns are not modified.
The number of groups may be reduced (if .preserve is not TRUE).
Data frame attributes are preserved.
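The effect of .preserve on the number of groups can be sketched with a grouped nested data frame in which one group loses all of its rows. The data (`games`, `team`, `player`, `points`, `team_data`) are hypothetical; the example assumes .preserve is passed through to dplyr::filter() as the argument description suggests.

```r
library(dplyr)
library(tidyr)
library(nplyr)

# Hypothetical toy data: player "y" never scores more than 10 points
games <- tibble(
  team   = c("a", "a", "a", "a"),
  player = c("x", "x", "y", "y"),
  points = c(12, 15, 2, 3)
)

# Nest by team, then group each nested data frame by player
games_nest <- games %>%
  nest(team_data = -team) %>%
  nest_group_by(team_data, player)

# Default: grouping is recalculated, so the emptied group should disappear
recalculated <- games_nest %>%
  nest_filter(team_data, points > 10)

# .preserve = TRUE: the original grouping structure should be kept
preserved <- games_nest %>%
  nest_filter(team_data, points > 10, .preserve = TRUE)

# Number of groups in the nested data frame under each setting
n_groups(recalculated$team_data[[1]])
n_groups(preserved$team_data[[1]])
```

Both results contain the same rows; only the recorded grouping structure differs.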

Examples

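# attach nplyr for nest_filter() and dplyr for the pipe operator (%>%)
library(nplyr)
library(dplyr)

# nest the gapminder data by continent (requires the gapminder and tidyr packages)
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)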

# apply a filter
gm_nest %>%
  nest_filter(country_data, year > 1972)

# apply multiple filters
gm_nest %>%
  nest_filter(country_data, year > 1972, pop < 10000000)

# apply a filter on grouped data (pop is compared with each country's mean)
gm_nest %>%
  nest_group_by(country_data, country) %>%
  nest_filter(country_data, pop > mean(pop))

[Package nplyr version 0.2.0 Index]