filter_row {matrixset} | R Documentation |
Subset rows using annotation values
Description
The filter_row() function subsets the rows of all matrices of a matrixset, retaining only the rows that satisfy the given condition(s). It works like dplyr's dplyr::filter().
Usage
filter_row(.ms, ..., .preserve = FALSE)
Arguments
.ms | |
... | Condition, or expression, that returns a logical value, used to determine if rows are kept or discarded. The expression may refer to row annotations, i.e. columns of the row annotation data frame (row_info). |
.preserve | |
Details
The conditions are given as expressions in ..., which are applied to columns of the annotation data frame (row_info) to determine which rows should be retained.
filter_row() can be applied to both grouped and ungrouped matrixset objects; see row_group_by() and the section ‘Grouped matrixsets’.
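As a rough illustration of the mechanics described above, the following base R sketch evaluates a condition against an annotation data frame and applies the resulting logical index to every matrix. It does not use matrixset internals and is not how the package is implemented: annot, mats and keep are hypothetical stand-ins for the row annotation, the matrices and the filtering result.

```r
# Hypothetical stand-in for a row annotation data frame (made-up values)
annot <- data.frame(
  rowname = paste0("student", 1:4),
  class = c("classA", "classA", "classB", "classB"),
  previous_year_score = c(0.9, 0.6, 0.8, 0.7)
)

# Two hypothetical matrices sharing the same rows
mats <- list(
  first  = matrix(1:8,  nrow = 4, dimnames = list(annot$rowname, c("c1", "c2"))),
  second = matrix(9:16, nrow = 4, dimnames = list(annot$rowname, c("c1", "c2")))
)

# The condition is evaluated with the annotation columns in scope
keep <- with(annot, class == "classA" & previous_year_score > 0.75)

# The same logical index subsets the annotation and every matrix
annot_kept <- annot[keep, , drop = FALSE]
mats_kept  <- lapply(mats, function(m) m[keep, , drop = FALSE])
```

Only student1 satisfies both conditions here, so each element of mats_kept has a single row.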
Value
A matrixset, with possibly a subset of the rows of the original object. Groups will be updated if .preserve is TRUE.
Grouped matrixsets
Column grouping (column_group_by()) has no impact on row filtering.
The impact of row grouping (row_group_by()) on row filtering depends on the conditions. Often, row grouping has no impact, but as soon as an aggregating, lagging or ranking function is involved, the results will generally differ.
For instance, the following two calls are not equivalent (except by pure coincidence).
student_results %>% filter_row(previous_year_score > mean(previous_year_score))
And its grouped counterpart:
student_results %>% row_group_by(class) %>% filter_row(previous_year_score > mean(previous_year_score))
In the ungrouped version, the mean of previous_year_score is taken globally and filter_row keeps rows with previous_year_score greater than this global average. In the grouped version, the average is calculated within each class and the kept rows are the ones with previous_year_score greater than the within-class average.
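The difference between the two versions can be sketched in base R, without matrixset. The data frame annot and its values are hypothetical, and ave() is used here only as a stand-in for the per-group evaluation that row grouping triggers.

```r
# Hypothetical annotation values, for illustration only
annot <- data.frame(
  class = c("classA", "classA", "classB", "classB"),
  previous_year_score = c(0.9, 0.7, 0.5, 0.3)
)

# Ungrouped: a single global mean is compared against every row
keep_global <- annot$previous_year_score > mean(annot$previous_year_score)

# Grouped: ave() returns, for each row, the mean of that row's own class
class_mean  <- ave(annot$previous_year_score, annot$class, FUN = mean)
keep_within <- annot$previous_year_score > class_mean

keep_global  # TRUE TRUE FALSE FALSE
keep_within  # TRUE FALSE TRUE FALSE
```

With these values the global comparison keeps both classA rows, while the within-class comparison keeps the top row of each class.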
Examples
# Filtering using one condition
filter_row(student_results, class == "classA")
# Filtering using multiple conditions. These are equivalent
filter_row(student_results, class == "classA" & previous_year_score > 0.75)
filter_row(student_results, class == "classA", previous_year_score > 0.75)
# The potential difference between grouped and non-grouped.
filter_row(student_results, previous_year_score > mean(previous_year_score))
student_results |>
  row_group_by(teacher) |>
  filter_row(previous_year_score > mean(previous_year_score))