time_expand {timeplyr}    R Documentation

A time-based extension to tidyr::complete().

Description

A time-based extension to tidyr::complete().

Usage

time_expand(
  data,
  time = NULL,
  ...,
  .by = NULL,
  time_by = NULL,
  from = NULL,
  to = NULL,
  time_type = getOption("timeplyr.time_type", "auto"),
  time_floor = FALSE,
  week_start = getOption("lubridate.week.start", 1),
  expand_type = c("nesting", "crossing"),
  sort = TRUE,
  roll_month = getOption("timeplyr.roll_month", "preday"),
  roll_dst = getOption("timeplyr.roll_dst", "NA")
)

time_complete(
  data,
  time = NULL,
  ...,
  .by = NULL,
  time_by = NULL,
  from = NULL,
  to = NULL,
  time_type = getOption("timeplyr.time_type", "auto"),
  time_floor = FALSE,
  week_start = getOption("lubridate.week.start", 1),
  expand_type = c("nesting", "crossing"),
  sort = TRUE,
  fill = NA,
  roll_month = getOption("timeplyr.roll_month", "preday"),
  roll_dst = getOption("timeplyr.roll_dst", "NA")
)

Arguments

data

A data frame.

time

Time variable.

...

Groups to expand.

.by

(Optional). A selection of columns to group by for this operation. Columns are specified using tidy-select.

time_by

Time unit.
Must be one of the three:

  • string, specifying either the unit or the number and unit, e.g. time_by = "days" or time_by = "2 weeks"

  • named list of length one, the unit being the name, and the number the value of the list, e.g. list("days" = 7). For the vectorized time functions, you can supply multiple values, e.g. list("days" = 1:10).

  • Numeric vector. If time_by is a numeric vector and the time variable is not a date/datetime, then arithmetic is used, e.g. time_by = 1.
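As a rough illustration of the first two forms, the sketch below completes a small made-up data frame by day, assuming timeplyr 0.8.1 is installed. The names df, date and n are hypothetical and not part of the package. Both calls describe the same one-day step, so they should produce the same completed dates.

```r
library(timeplyr)

# Hypothetical toy data with gaps between the observed dates
df <- data.frame(
  date = as.Date(c("2023-01-01", "2023-01-04", "2023-01-08")),
  n = c(1L, 2L, 3L)
)

# String form: the unit alone, or a number and unit such as "2 days"
by_string <- time_complete(df, time = date, time_by = "days")

# Named list form: the name is the unit, the value is the number of units
by_list <- time_complete(df, time = date, time_by = list("days" = 1))
```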

from

Time series start date.

to

Time series end date.

time_type

If "auto", periods are used for the time expansion when days, weeks, months or years are specified, and durations are used otherwise.

time_floor

Should from be floored to the nearest unit specified through the time_by argument? This is particularly useful for starting sequences at the beginning of a week or month for example.

week_start

Day on which the week starts, following ISO conventions: 1 means Monday (default), 7 means Sunday. This is only used when time_floor = TRUE.
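To give a feel for what flooring the start of a sequence means, here is a minimal sketch that calls lubridate::floor_date() directly. It is an assumption that this mirrors the effect of time_floor = TRUE together with week_start; it is not taken from the package source. The variable names are illustrative.

```r
library(lubridate)

from <- as.Date("2023-03-15")  # a Wednesday

# Floor to the start of the week, with the week starting on Monday
monday_start <- floor_date(from, unit = "week", week_start = 1)  # 2023-03-13

# Same date, with the week starting on Sunday
sunday_start <- floor_date(from, unit = "week", week_start = 7)  # 2023-03-12

# Floor to the start of the month
month_start <- floor_date(from, unit = "month")  # 2023-03-01
```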

expand_type

Type of time expansion to use where "nesting" finds combinations already present in the data, "crossing" finds all combinations of values in the group variables.
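The difference between the two expansion types can be sketched in base R, without timeplyr. This is a conceptual illustration only, using a made-up groups data frame; it does not show how the package computes the combinations internally.

```r
# Hypothetical group columns: LGA -> SFO never occurs in the data
groups <- data.frame(
  origin = c("JFK", "JFK", "LGA"),
  dest = c("LAX", "SFO", "LAX")
)

# "nesting": only the combinations that already appear in the data
nesting <- unique(groups)

# "crossing": every combination of the distinct values of each variable
crossing <- expand.grid(
  origin = unique(groups$origin),
  dest = unique(groups$dest),
  stringsAsFactors = FALSE
)

nrow(nesting)   # 3
nrow(crossing)  # 4
```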

sort

Logical. If TRUE, expanded/completed variables are sorted.

roll_month

Control how impossible dates are handled when month or year arithmetic is involved. Options are "preday", "boundary", "postday", "full" and "NA". See ?timechange::time_add for more details.
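A short sketch of what an impossible date looks like, calling timechange::time_add() directly: adding one month to 31 January lands on 31 February, which does not exist, and roll_month decides what is returned instead. The variable names are illustrative.

```r
library(timechange)

jan31 <- as.Date("2021-01-31")

# "preday": roll back to the last valid day of the month
preday <- time_add(jan31, months = 1, roll_month = "preday")    # 2021-02-28

# "postday": roll forward to the first day of the following month
postday <- time_add(jan31, months = 1, roll_month = "postday")  # 2021-03-01

# "NA": return a missing value for the impossible date
na_result <- time_add(jan31, months = 1, roll_month = "NA")     # NA
```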

roll_dst

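Controls how datetimes that fall into skipped or repeated daylight saving time (DST) intervals are handled. See ?timechange::time_add for the full list of options.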

fill

A named list of column-value pairs; each value replaces the implicit missing values in the named column.
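A minimal sketch of fill, assuming timeplyr 0.8.1 is installed. The data frame df and its columns are made up for illustration. The rows added for the missing days should get n = 0 rather than NA; an integer 0L is used so that the fill value matches the type of the column.

```r
library(timeplyr)

# Hypothetical daily counts with two missing days in between
df <- data.frame(
  date = as.Date(c("2023-01-01", "2023-01-04")),
  n = c(5L, 7L)
)

# Complete the missing days and fill their counts with zero
completed <- time_complete(
  df,
  time = date,
  time_by = "days",
  fill = list(n = 0L)
)
```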

Details

This works much the same as tidyr::complete(), except that you can supply an additional time argument to fill in time gaps, expand time, and aggregate time to a higher unit. lubridate is used for handling time, while data.table and collapse are used for the data frame expansion.
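Conceptually, filling in time gaps amounts to building the full regular sequence of times and joining the observed rows onto it. The base R sketch below shows that idea on made-up hourly counts; it is an illustration of the concept, not the package's data.table/collapse implementation.

```r
# Hypothetical hourly counts with 07:00 and 08:00 missing
counts <- data.frame(
  time_hour = as.POSIXct(
    c("2013-01-01 05:00:00", "2013-01-01 06:00:00", "2013-01-01 09:00:00"),
    tz = "UTC"
  ),
  n = c(6L, 52L, 50L)
)

# Full hourly sequence from the first to the last observed hour
full_hours <- data.frame(
  time_hour = seq(min(counts$time_hour), max(counts$time_hour), by = "hour")
)

# Left join: hours with no observations get NA for n
completed <- merge(full_hours, counts, by = "time_hour", all.x = TRUE)
```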

At the moment, within-group combinations are ignored. This means that when expand_type = "nesting", existing combinations of the supplied groups across the entire dataset are used, and when expand_type = "crossing", all possible combinations of the supplied groups across the entire dataset are used.

Value

A data.frame of expanded time by or across groups.

Examples

library(timeplyr)
library(dplyr)
library(lubridate)
library(nycflights13)

x <- flights$time_hour

time_num_gaps(x) # Missing hours

flights_count <- flights %>%
  fcount(time_hour)

# Fill in missing hours
flights_count %>%
  time_complete(time = time_hour)

# You can specify units too
flights_count %>%
  time_complete(time = time_hour, time_by = "hours")
flights_count %>%
  time_complete(time = as_date(time_hour), time_by = "days") # Nothing to complete here

# Where time_expand() and time_complete() really shine is how fast they are with groups
flights %>%
  group_by(origin, dest) %>%
  time_expand(time = time_hour, time_by = dweeks(1))


[Package timeplyr version 0.8.1 Index]