time_summarise {timeplyr}    R Documentation

A time based extension to dplyr::summarise()

Description

This works much the same as dplyr::summarise(), except that you can supply an additional time argument so that the summary is aggregated to a higher time unit (for example, daily data summarised by month).

Usage

time_summarise(
  data,
  time = NULL,
  ...,
  time_by = NULL,
  from = NULL,
  to = NULL,
  time_type = getOption("timeplyr.time_type", "auto"),
  include_interval = FALSE,
  .by = NULL,
  time_floor = FALSE,
  week_start = getOption("lubridate.week.start", 1),
  roll_month = getOption("timeplyr.roll_month", "preday"),
  roll_dst = getOption("timeplyr.roll_dst", "boundary"),
  sort = TRUE
)

Arguments

data

A data frame.

time

Time variable.

...

Additional variables to include. As the examples show, these are name-value pairs of summary expressions, as in dplyr::summarise().

time_by

Time unit.
Must be one of the following three:

  • String, specifying either the unit or the number and unit, e.g. time_by = "days" or time_by = "2 weeks".

  • Named list of length one, the unit being the name and the number being the value of the list, e.g. list("days" = 7). For the vectorized time functions, you can supply multiple values, e.g. list("days" = 1:10).

  • Numeric vector. If time_by is a numeric vector and the time variable is not a date/datetime, then arithmetic is used, e.g. time_by = 1.
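As a rough sketch of the first two forms, the calls below summarise a small made-up data frame with a string and with a named list. The names df, date, value and total are illustrative and not part of the package, and the sketch assumes timeplyr 0.5.0 behaves as documented on this page. Both calls should bucket the data into the same 14-day intervals.

```r
library(timeplyr)
library(dplyr)

# Hypothetical daily data covering four weeks
df <- data.frame(
  date = as.Date("2023-01-01") + 0:27,
  value = 1:28
)

# String form: number and unit in one string
by_string <- df %>%
  time_summarise(time = date, time_by = "2 weeks", total = sum(value))

# Named-list form: the unit is the name, the number is the value
by_list <- df %>%
  time_summarise(time = date, time_by = list("days" = 14), total = sum(value))
```

The string form is convenient for interactive use, while the list form is easier to build programmatically.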

from

Time series start date.

to

Time series end date.

time_type

If "auto", periods are used for the time expansion when days, weeks, months or years are specified, and durations are used otherwise.
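The difference between periods and durations can be illustrated with lubridate directly. This is only a sketch of the general distinction, on the assumption that time_type follows lubridate's meaning of the two terms; it does not show timeplyr's internals. The date and time zone are chosen so that a daylight-saving change falls inside the added day.

```r
library(lubridate)

# Noon on the day before US clocks move forward (12 March 2023)
x <- ymd_hms("2023-03-11 12:00:00", tz = "America/New_York")

# Period: one calendar day later, wall-clock time preserved
x_period <- x + days(1)

# Duration: exactly 86400 seconds later, so the clock reads one hour ahead
x_duration <- x + ddays(1)

format(x_period, "%Y-%m-%d %H:%M")    # "2023-03-12 12:00"
format(x_duration, "%Y-%m-%d %H:%M")  # "2023-03-12 13:00"
```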

include_interval

Logical. If TRUE, a column "interval" of the form time_min <= x < time_max is added, showing the time interval to which each summarised row belongs. The rightmost interval is always closed.

.by

(Optional). A selection of columns to group by for this operation. Columns are specified using tidy-select.

time_floor

Should from be floored to the nearest unit specified through the time_by argument? This is particularly useful for starting sequences at the beginning of a week or month for example.

week_start

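Day on which the week starts, following ISO conventions: 1 means Monday and 7 means Sunday. The default is taken from the lubridate.week.start option, falling back to 1 (Monday) when that option is not set. This is only used when time_floor = TRUE.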
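To show what flooring to a unit means and how week_start changes the result, here is a small sketch using lubridate::floor_date(). It stands in for the flooring described above and is not necessarily the function timeplyr calls internally; the date is an arbitrary example.

```r
library(lubridate)

x <- as.Date("2023-01-04")  # a Wednesday

# Floor to the start of the week, with the week starting on Monday
floor_monday <- floor_date(x, "week", week_start = 1)  # "2023-01-02"

# Same date, with the week starting on Sunday
floor_sunday <- floor_date(x, "week", week_start = 7)  # "2023-01-01"

# Floor to the start of the month
floor_month <- floor_date(x, "month")                  # "2023-01-01"
```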

roll_month

Control how impossible dates are handled when month or year arithmetic is involved. Options are "preday", "boundary", "postday", "full" and "NA". See ?timechange::time_add for more details.
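As a hedged illustration of an impossible date, the sketch below adds one month to 31 January with timechange::time_add(), the function referenced above. It assumes the roll_month values documented for timechange: "preday" rolls back to the last valid day of the month, "postday" rolls forward to the first day of the next month, and "NA" returns a missing value.

```r
library(timechange)

# 31 January plus one month would be "31 February", which does not exist
x <- as.POSIXct("2021-01-31 12:00:00", tz = "UTC")

# Roll back to the last valid day of February
rolled_preday <- time_add(x, month = 1, roll_month = "preday")

# Roll forward to the first day of March
rolled_postday <- time_add(x, month = 1, roll_month = "postday")

# Give up and return NA instead of rolling
rolled_na <- time_add(x, month = 1, roll_month = "NA")
```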

roll_dst

Control how date-times that fall into a daylight saving time gap or overlap are handled. See ?timechange::time_add for the full list of details.

sort

Should the result be sorted? Default is TRUE. If FALSE, the original (input) order is kept. The sorting only applies to the groups and the time variable.

Value

A summarised data.frame.

Examples

library(timeplyr)
library(dplyr)
library(lubridate)
library(nycflights13)

# Works the same way as summarise()
# Monthly average arrival time
flights %>%
  mutate(date = as_date(time_hour)) %>%
  time_summarise(mean_arr_time = mean(arr_time, na.rm = TRUE),
                 time = date,
                 time_by = "month",
                 include_interval = TRUE)

# Example of monthly summary using zoo's yearmon
flights %>%
  mutate(yearmon = zoo::as.yearmon(as_date(time_hour))) %>%
  time_summarise(time = yearmon,
                 n = n(),
                 mean_arr_time = mean(arr_time, na.rm = TRUE),
                 include_interval = TRUE)



[Package timeplyr version 0.5.0 Index]