filter_column {matrixset} | R Documentation |
Subset columns using annotation values
Description
The filter_column() function subsets the columns of all matrices of a matrixset, retaining all columns that satisfy given condition(s). The function filter_column works like dplyr's dplyr::filter().
Usage
filter_column(.ms, ..., .preserve = FALSE)
Arguments
.ms |
|
... |
Condition, or expression, that returns a logical value, used to determine if columns are kept or discarded. The expression may refer to column annotations, that is, columns of the annotation data frame (column_info). |
.preserve |
|
Details
The conditions are given as expressions in ..., which are applied to columns of the annotation data frame (column_info) to determine which columns should be retained.
filter_column() can be applied to both grouped and ungrouped matrixset objects (see column_group_by() and the section ‘Grouped matrixsets’).
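The mechanism described above, a logical condition evaluated against the annotation columns and then used to subset the matrix columns, can be illustrated with a minimal base R sketch. This is not the matrixset implementation: the matrix scores, the data frame column_info and its column named column are hypothetical stand-ins for a matrix of a matrixset and its column annotation.

```r
# Hypothetical data: a 3 x 4 matrix and one annotation row per matrix column.
scores <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("student", 1:3),
                                 c("math", "physics", "history", "art")))
column_info <- data.frame(
  column = colnames(scores),
  program = c("Applied Science", "Applied Science", "Humanities", "Humanities"),
  school_average = c(0.85, 0.70, 0.90, 0.60),
  stringsAsFactors = FALSE
)

# Evaluate the condition with the annotation columns in scope.
keep <- with(column_info, program == "Applied Science" & school_average > 0.8)

# Retain only the matrix columns whose annotation satisfies the condition.
kept <- scores[, column_info$column[keep], drop = FALSE]
colnames(kept)  # "math"
```

In this sketch only the matrix columns are subset; a real matrixset would presumably also keep its annotation data frame in sync with the retained columns.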
Value
A matrixset, with possibly a subset of the columns of the original object. Groups will be updated if .preserve is TRUE.
Grouped matrixsets
Row grouping (row_group_by()) has no impact on column filtering.
The impact of column grouping (column_group_by()) on column filtering depends on the conditions. Often, column grouping will not have any impact, but as soon as an aggregating, lagging or ranking function is involved, the results will differ.
For instance, the following two calls are not equivalent (except by pure coincidence).
student_results %>% filter_column(school_average > mean(school_average))
And its grouped counterpart:
student_results %>% column_group_by(program) %>% filter_column(school_average > mean(school_average))
In the ungrouped version, the mean of school_average is taken globally and filter_column keeps columns with school_average greater than this global average. In the grouped version, the average is calculated within each program and the kept columns are the ones with school_average greater than the within-program average.
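The difference between a global and a within-group average can be checked by hand with a small base R sketch. The annotation values below are hypothetical and are not taken from student_results; ave() is used here only to mimic what a grouped evaluation of mean() would produce.

```r
# Hypothetical column annotation, one row per matrix column.
column_info <- data.frame(
  column = c("math", "physics", "history", "art"),
  program = c("Applied Science", "Applied Science", "Humanities", "Humanities"),
  school_average = c(0.90, 0.80, 0.60, 0.50),
  stringsAsFactors = FALSE
)

# Ungrouped: compare each column with the global mean (0.70).
global_keep <- column_info$school_average > mean(column_info$school_average)

# Grouped: compare each column with the mean of its own program.
# ave() returns the group mean repeated for every member of the group.
group_mean <- ave(column_info$school_average, column_info$program, FUN = mean)
grouped_keep <- column_info$school_average > group_mean

column_info$column[global_keep]   # "math" "physics"
column_info$column[grouped_keep]  # "math" "history"
```

Here physics passes the global threshold but not the threshold of its own group, while history does the opposite, which is why the grouped and ungrouped calls generally return different columns.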
Examples
# Filtering using one condition
filter_column(student_results, program == "Applied Science")
# Filtering using multiple conditions. These are equivalent
filter_column(student_results, program == "Applied Science" & school_average > 0.8)
filter_column(student_results, program == "Applied Science", school_average > 0.8)
# The potential difference between grouped and non-grouped.
filter_column(student_results, school_average > mean(school_average))
student_results |>
  column_group_by(program) |>
  filter_column(school_average > mean(school_average))