time_by {timeplyr}    R Documentation

Group by a time variable at a higher time unit

Description

time_by groups a time variable by a specified time unit, for example "days" or "weeks".
It is used in the same way as dplyr::group_by().

Usage

time_by(
  data,
  time,
  time_by = NULL,
  from = NULL,
  to = NULL,
  .name = paste0("time_intv_", time_by_pretty(time_by, "_")),
  .add = FALSE,
  time_type = getOption("timeplyr.time_type", "auto"),
  as_interval = getOption("timeplyr.use_intervals", FALSE),
  .time_by_group = TRUE
)

time_by_span(x)

time_by_var(x)

time_by_units(x)

Arguments

data

A data frame.

time

Time variable (data-masking).
Can be a Date, POSIXt, numeric, integer, yearmon, or yearqtr.

time_by

Time unit.
Must be one of the following:

  • String, specifying either the unit or the number and unit, e.g. time_by = "days" or time_by = "2 weeks".

  • lubridate duration or period object, e.g. ddays(1) or days(1).

  • Named list of length one, the unit being the name and the number being the value of the list, e.g. list("days" = 7). For the vectorized time functions, you can supply multiple values, e.g. list("days" = 1:10).

  • Numeric vector. If time_by is a numeric vector and the time variable is not a date/datetime, then arithmetic is used, e.g. time_by = 1.
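
The forms listed above can be compared side by side. The following is a minimal sketch, assuming the nycflights13 flights data used in the Examples section; the object names by_string, by_period and by_list are illustrative only, and the three calls are expected (not guaranteed here) to describe the same two-week width.

```r
library(dplyr)
library(timeplyr)
library(lubridate)
library(nycflights13)

# String giving the number and the unit
by_string <- flights %>%
  time_by(time_hour, time_by = "2 weeks")

# lubridate period object
by_period <- flights %>%
  time_by(time_hour, time_by = weeks(2))

# Named list of length one: the name is the unit, the value is the number
by_list <- flights %>%
  time_by(time_hour, time_by = list("weeks" = 2))

# Inspect the unit stored on each result
time_by_units(by_string)
time_by_units(by_period)
time_by_units(by_list)
```
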

from

(Optional) Start time.

to

(Optional) End time.

.name

An optional glue specification passed to stringr::str_glue(), which can be used to concatenate strings to the time column name or to replace it.

.add

Should the time groups be added to existing groups? Default is FALSE.

time_type

If "auto", periods are used for the time aggregation when days, weeks, months or years are specified, and durations are used otherwise. If durations are used, the output is always of class POSIXct.

as_interval

Should the time variable be returned as a time_interval? Default is FALSE.
This can be controlled globally through options(timeplyr.use_intervals).

.time_by_group

Should the time aggregations be built on a group-by-group basis (the default), or should the time variable be aggregated using the full data? If done by group, different groups may contain different time sequences. This only applies when .add = TRUE.
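
As a rough illustration of how .add and .time_by_group interact, the sketch below groups the nycflights13 flights data by origin first and then adds a monthly time group. The object names per_group and full_data are illustrative; how the resulting sequences differ between the two calls depends on the data, so no output is shown.

```r
library(dplyr)
library(timeplyr)
library(nycflights13)

# Time groups are added to the existing 'origin' groups, and the
# time aggregation is built separately within each origin (the default)
per_group <- flights %>%
  group_by(origin) %>%
  time_by(time_hour, "month", .add = TRUE)

# Same grouping, but the time variable is aggregated using the full data
full_data <- flights %>%
  group_by(origin) %>%
  time_by(time_hour, "month", .add = TRUE, .time_by_group = FALSE)

# Compare the number of rows per origin and time group
per_group %>% count()
full_data %>% count()
```
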

x

A time_tbl_df.

Value

A time_tbl_df, which for practical purposes can be treated in the same way as a dplyr grouped_df.
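
A short sketch of what this means in practice, again assuming the nycflights13 flights data: the result is passed to ordinary dplyr verbs, and the accessor functions from the Usage section are called on it. The names monthly and monthly_counts are illustrative, and the exact form of the accessor results is not shown here.

```r
library(dplyr)
library(timeplyr)
library(nycflights13)

monthly <- flights %>%
  time_by(time_hour, "month")

# Grouped verbs operate per time group, as with a grouped_df
monthly_counts <- monthly %>%
  count()

# Accessors for the time grouping metadata
time_by_var(monthly)    # the time variable used for grouping
time_by_units(monthly)  # the time unit
time_by_span(monthly)   # the time span covered
```
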

Examples

library(dplyr)
library(timeplyr)
library(nycflights13)
library(lubridate)


# Basic usage
hourly_flights <- flights %>%
  time_by(time_hour) # Detects time granularity

hourly_flights
time_by_span(hourly_flights)

monthly_flights <- flights %>%
  time_by(time_hour, "month")
weekly_flights <- flights %>%
  time_by(time_hour, "week", from = floor_date(min(time_hour), "week"))

monthly_flights %>%
  count()

weekly_flights %>%
  summarise(n = n(), arr_delay = mean(arr_delay, na.rm = TRUE))

# To aggregate multiple variables, use time_aggregate

flights %>%
  select(time_hour) %>%
  mutate(across(everything(), \(x) time_aggregate(x, time_by = "weeks"))) %>%
  count(time_hour)


[Package timeplyr version 0.8.1 Index]