group_dt {tidyfst} | R Documentation |
Data manipulation within groups
Description
Carry out data manipulation within specified groups.
Usage
group_dt(.data, by = NULL, ...)
rowwise_dt(.data, ...)
Arguments
.data | A data.frame.
by | Variables to group by: an unquoted name of a grouping variable, or a list of unquoted names of grouping variables.
... | Any data manipulation arguments that could be implemented on a data.frame.
Details
If you want to use summarise_dt or mutate_dt within group_dt, it is better to use the "by" parameter of those functions directly. That is much faster because it avoids .SD, which takes extra time to copy.
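As a minimal sketch of the two approaches described above, the following computes the same grouped mean twice on the built-in mtcars data: once through the "by" argument of summarise_dt and once through group_dt. The variable names direct and via_group are illustrative only, and no timings are shown, since any speed difference depends on the data.

```r
library(tidyfst)

# Grouped summary through the "by" argument of summarise_dt
direct <- mtcars %>%
  summarise_dt(avg = mean(mpg), by = cyl)

# The same summary through group_dt, which applies the call to .SD per group
via_group <- mtcars %>%
  group_dt(by = cyl, summarise_dt(avg = mean(mpg)))

# Both results hold one row per value of cyl with the same group means
all.equal(sort(direct$avg), sort(via_group$avg))
```

Both calls should return a data.table with one row per group; the first form is the one recommended above when only summarise_dt or mutate_dt is needed.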
Value
data.table
References
https://stackoverflow.com/questions/36802385/use-by-each-row-for-data-table
Examples
iris %>% group_dt(by = Species, slice_dt(1:2))
iris %>% group_dt(Species, filter_dt(Sepal.Length == max(Sepal.Length)))
iris %>% group_dt(Species, summarise_dt(new = max(Sepal.Length)))
# you can use pipes inside `group_dt`
iris %>% group_dt(Species,
                  mutate_dt(max = max(Sepal.Length)) %>%
                    summarise_dt(sum = sum(Sepal.Length)))
# for users familiar with data.table, you can work on .SD directly
# the following code gets the first and last row from each group
iris %>%
  group_dt(
    by = Species,
    rbind(.SD[1], .SD[.N])
  )
# for summarise_dt, you can use "by" to calculate within the group
mtcars %>%
  summarise_dt(
    disp = mean(disp),
    hp = mean(hp),
    by = cyl
  )
# but you could also, of course, use group_dt
mtcars %>%
  group_dt(by = .(vs, am),
           summarise_dt(avg = mean(mpg)))
# a list of variables can also be used
mtcars %>%
  group_dt(by = list(vs, am),
           summarise_dt(avg = mean(mpg)))
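# examples for `rowwise_dt`
df <- data.table(x = 1:2, y = 3:4, z = 4:5)
# without rowwise_dt, mean() is taken over all values of x, y and z at once
df %>% mutate_dt(m = mean(c(x, y, z)))
# with rowwise_dt, the mean is computed separately for each row
df %>% rowwise_dt(
  mutate_dt(m = mean(c(x, y, z)))
)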
[Package tidyfst version 1.7.9 Index]