time_aggregate {timeplyr}    R Documentation

Aggregate time to a higher unit

Description

Aggregate time to a higher unit for possibly many groups with respect to a time index.

Usage

time_aggregate(
  x,
  time_by = NULL,
  g = NULL,
  time_type = getOption("timeplyr.time_type", "auto"),
  roll_month = getOption("timeplyr.roll_month", "preday"),
  roll_dst = getOption("timeplyr.roll_dst", "boundary"),
  direction = c("l2r", "r2l")
)

Arguments

x

Time vector.
Can be a Date, POSIXt, numeric, integer, yearmon, or yearqtr vector.

time_by

Time unit.
Must be one of the following:

  • string, e.g. time_by = "day" or time_by = "2 weeks".

  • lubridate duration or period object, e.g. days(1) or ddays(1).

  • named list of length one, e.g. list("days" = 7).

  • numeric vector, e.g. time_by = 7.

g

Grouping object passed directly to collapse::GRP(). This can for example be a vector or data frame.
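
As a rough illustration of what grouping implies, the base R sketch below gives each group its own reference time (the group minimum) and aggregates to 7-day blocks from there. It is a simplified stand-in that assumes fixed-width blocks and Date input; it does not use collapse::GRP() and is not the package's implementation. The names x, g, group_min and out are made up for the example.

```r
# Illustrative data: two groups with different first dates.
x <- as.Date(c("2024-01-01", "2024-01-09", "2024-01-04", "2024-01-12"))
g <- c("a", "a", "b", "b")

# Reference time per element: the minimum date within its own group
# (assumption: this mirrors the default left-to-right behaviour).
group_min <- ave(as.numeric(x), g, FUN = min)

# Floor each date to a 7-day block counted from its group's minimum.
blocks <- floor((as.numeric(x) - group_min) / 7)
out <- as.Date(group_min + 7 * blocks, origin = "1970-01-01")
out
```

Group "a" is anchored at 2024-01-01 and group "b" at 2024-01-04, so the two groups end up with different block start dates.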

time_type

If "auto", periods are used for the time expansion when days, weeks, months or years are specified, and durations are used otherwise.

roll_month

Control how impossible dates are handled when month or year arithmetic is involved.

roll_dst

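Control how datetimes that fall into skipped or repeated daylight saving time (DST) transitions are handled. See ?timechange::time_add for the full list of details.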

direction

Direction in which to aggregate time, "l2r" ("left-to-right") or "r2l" ("right-to-left"). If "l2r" (the default), then the minimum time is used as the reference time, otherwise the maximum time is used.
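
The difference between the two directions can be sketched in base R for fixed 7-day blocks. This is an assumption-laden illustration of the description above, not the package's code: width, l2r and r2l are names invented for the example, and period units such as months would need calendar-aware arithmetic instead.

```r
x <- as.Date("2024-01-01") + c(0, 3, 6, 7, 13, 14, 20)
width <- 7  # block width in days (illustrative)

# "l2r": blocks are counted from min(x), so values move backwards
# to the start of their block.
l2r <- min(x) + width * floor(as.numeric(x - min(x)) / width)

# "r2l": blocks are counted from max(x), so values move forwards
# to the end of their block.
r2l <- max(x) - width * floor(as.numeric(max(x) - x) / width)

data.frame(x, l2r, r2l)
```

In this sketch the smallest left-to-right value equals min(x) and the largest right-to-left value equals max(x), which is what "reference time" means here.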

Details

time_aggregate aggregates time into distinct, consecutive blocks, each spanning the specified time unit.

The actual calculation is extremely simple and essentially requires a subtraction, a rounding and an addition.
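
The three steps can be sketched in base R as follows. aggregate_days is a hypothetical helper written for this illustration and is not part of timeplyr; it assumes Date input, no grouping, and a fixed block width in days (duration-like units only).

```r
# Hypothetical helper: aggregate Dates to fixed-width blocks of
# `width_days` days, anchored at the earliest date.
aggregate_days <- function(x, width_days) {
  origin <- min(x)                               # reference time
  offset <- as.numeric(x) - as.numeric(origin)   # subtraction
  blocks <- floor(offset / width_days)           # rounding
  origin + blocks * width_days                   # addition
}

x <- as.Date("2024-01-01") + c(0, 3, 6, 7, 13, 14, 20)
res <- aggregate_days(x, 7)
res
```

The result has the same length as x, and every date is replaced by the first day of its 7-day block.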

For example, if time_by = "week", all dates or datetimes are shifted backwards (or forwards if direction is "r2l") to the nearest start of the week, where the start of the week is based on min(x). This is identical to building a weekly sequence and using it as breakpoints to cut x. No time expansion occurs, so this is very efficient, except when periods are used and there is a lot of data. In that case, provided the expansion is not too big, it may be more efficient to cut the data using the period sequence, which can be achieved with time_summarisev.
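
The equivalence with cutting by a weekly sequence can be checked with a small base R sketch. It uses findInterval() as a stand-in for the cutting step and assumes Date input with 7-day blocks; the variable names are illustrative and nothing here calls timeplyr.

```r
x <- as.Date("2024-01-03") + c(0, 2, 6, 7, 9, 15, 21)
origin <- min(x)

# Arithmetic approach: floor each date to a 7-day block anchored at min(x).
by_arithmetic <- origin + 7 * floor(as.numeric(x - origin) / 7)

# Breakpoint approach: build the weekly sequence from min(x) and assign
# each date to the last breakpoint that is not after it.
breaks <- seq(origin, max(x), by = "week")
by_breaks <- breaks[findInterval(as.numeric(x), as.numeric(breaks))]

all(by_arithmetic == by_breaks)
```

The arithmetic approach never materialises the sequence of breakpoints, which is why no time expansion is needed in the fixed-width case.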

Value

A time-aggregated vector of the same class and length as x.

See Also

time_summarisev

Examples

library(timeplyr)
library(nycflights13)
library(lubridate)
library(dplyr)

sunique <- function(x) sort(unique(x))

hours <- sunique(flights$time_hour)
days <- as_date(hours)

# Aggregate by week or any time unit easily
unique(time_aggregate(hours, "week"))
unique(time_aggregate(hours, ddays(14)))
unique(time_aggregate(hours, "month"))
unique(time_aggregate(days, "month"))

# Left aligned
unique(time_aggregate(days, "quarter"))
# Right aligned
unique(time_aggregate(days, "quarter", direction = "r2l"))

# Very fast by-group aggregation
week_by_tailnum <- time_aggregate(flights$time_hour, time_by = ddays(7),
                                  g = flights$tailnum)
# Confirm this has been done by group, as each group will have a
# different aggregate start date
flights %>%
  mutate(week_by_tailnum) %>%
  stat_summarise(week_by_tailnum, .by = tailnum, stat = "min",
                 sort = FALSE)


[Package timeplyr version 0.5.0 Index]