lump_rows {tidytidbits} | R Documentation |
Lump rows of a tibble
Description
A verb for a dplyr pipeline:
In the given data frame, take the .level column as a set of levels and the .count column
as corresponding counts. Return a data frame where the rows are lumped according to levels/counts
using the parameters n, prop, other_level and ties.method, as for lump().
The resulting row for other_level has level = other_level and count = sum(counts of all lumped rows).
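To make the idea of lumping concrete, the following is a minimal by-hand sketch using plain dplyr. It is not the implementation of lump_rows(); it only illustrates the effect described above under the assumption that n keeps the rows with the largest counts. The data frame, its column names (fruit, count) and the variable n_keep are hypothetical.

```r
library(dplyr)

# Hypothetical example data: one row per level with its count
df <- tibble(
  fruit = c("apple", "banana", "cherry", "date"),
  count = c(50, 30, 15, 5)
)

# Number of rows to preserve (plays the role of `n`; an assumption for this sketch)
n_keep <- 2

lumped <- df %>%
  # Rows ranked beyond n_keep by descending count get the "Other" level
  mutate(fruit = if_else(rank(-count, ties.method = "min") <= n_keep,
                         fruit, "Other")) %>%
  # Collapse all rows sharing a level into one row, summing their counts
  group_by(fruit) %>%
  summarise(count = sum(count), .groups = "drop") %>%
  arrange(desc(count))

print(lumped)
```

Here cherry and date are collapsed into a single "Other" row whose count is the sum of their counts.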
For the remaining columns, either a default concatenation is used, or you can provide
custom summarising statements via the summarising_statements parameter.
Provide a list named by the column you want to summarize, giving statements wrapped in quo(),
using syntax as you would for a call to summarise().
Usage
lump_rows(
.df,
.level,
.count,
summarising_statements = quos(),
n,
prop,
remaining_levels,
other_level = "Other",
ties.method = c("min", "average", "first", "last", "random", "max")
)
Arguments
.df | A data frame
.level | Column name (symbolic) containing a set of levels
.count | Column name (symbolic) containing counts of the levels
summarising_statements | The "lumped" rows need to have all their columns summarised into one row. This parameter is a vars() list of arguments as if used in a call to summarise().
n | If specified, n rows shall be preserved.
prop | If specified, rows shall be preserved if their count >= prop
remaining_levels | Levels that should explicitly not be lumped
other_level | Name of the "other" level to be created from lumped rows
ties.method | Method to apply in case of ties
Value
The lumped data frame
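Examples
A hedged usage sketch based only on the signature shown under Usage. The data frame and its columns (fruit, count, price) are hypothetical, and the choice of mean() for the price column is just one possible summarising statement. No output is shown because the exact form of the result depends on the package.

```r
library(dplyr)
library(tidytidbits)

# Hypothetical example data: levels, their counts, and one extra column
df <- tibble(
  fruit = c("apple", "banana", "cherry", "date"),
  count = c(50, 30, 15, 5),
  price = c(1.0, 0.5, 4.0, 6.0)
)

# Preserve two rows and lump the rest into "Other".
# The extra column is summarised with a custom statement, written as it
# would be inside summarise() and named by the column it applies to.
lumped <- df %>%
  lump_rows(
    fruit,
    count,
    summarising_statements = quos(price = mean(price)),
    n = 2,
    other_level = "Other"
  )

print(lumped)
```

Omitting summarising_statements falls back to the default concatenation for the remaining columns, as stated in the Description.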