| time_roll_sum {timeplyr} | R Documentation | 
Fast time-based by-group rolling sum/mean - Currently experimental
Description
time_roll_sum and time_roll_mean efficiently calculate
a rolling sum and a rolling mean, respectively, across
many groups and with respect to a date or datetime index. 
The window is always aligned "right". 
time_roll_window splits x into windows based on the index. 
time_roll_window_size returns the window sizes for all indices of x. 
time_roll_apply is a generic function that applies any function
on a rolling basis with respect to a time index. 
time_roll_growth_rate can efficiently calculate by-group
rolling growth rates with respect to a date/datetime index.
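To picture what a right-aligned, time-based window means, the following is a naive base R sketch and not the package's implementation. It assumes the half-open interval (t - window, t] noted in the Examples, partial windows, and a time index without duplicates. The helper naive_time_roll_sum is hypothetical and exists only for illustration.

```r
# Hypothetical reference helper: for each observation i, sum every value
# whose time falls in the half-open interval (time[i] - window, time[i]].
# Assumption: this mirrors the "right"-aligned window with an open left
# boundary; it is O(n^2) and meant only to show the semantics.
naive_time_roll_sum <- function(x, time, window) {
  vapply(seq_along(x), function(i) {
    in_window <- time > (time[i] - window) & time <= time[i]
    sum(x[in_window], na.rm = TRUE)
  }, numeric(1))
}

# Irregular time index with gaps
time <- c(1, 2, 4, 7, 8)
x <- c(10, 20, 30, 40, 50)

res <- naive_time_roll_sum(x, time, window = 3)
res
```

The result has one element per element of time, and each window contains only the observations that actually fall inside the time span, however many that is.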
Usage
time_roll_sum(
  x,
  window = Inf,
  time = seq_along(x),
  weights = NULL,
  g = NULL,
  partial = TRUE,
  close_left_boundary = FALSE,
  na.rm = TRUE,
  time_type = getOption("timeplyr.time_type", "auto"),
  roll_month = getOption("timeplyr.roll_month", "preday"),
  roll_dst = getOption("timeplyr.roll_dst", "NA"),
  ...
)
time_roll_mean(
  x,
  window = Inf,
  time = seq_along(x),
  weights = NULL,
  g = NULL,
  partial = TRUE,
  close_left_boundary = FALSE,
  na.rm = TRUE,
  time_type = getOption("timeplyr.time_type", "auto"),
  roll_month = getOption("timeplyr.roll_month", "preday"),
  roll_dst = getOption("timeplyr.roll_dst", "NA"),
  ...
)
time_roll_growth_rate(
  x,
  window = Inf,
  time = seq_along(x),
  time_step = NULL,
  g = NULL,
  partial = TRUE,
  close_left_boundary = FALSE,
  na.rm = TRUE,
  time_type = getOption("timeplyr.time_type", "auto"),
  roll_month = getOption("timeplyr.roll_month", "preday"),
  roll_dst = getOption("timeplyr.roll_dst", "NA")
)
time_roll_window_size(
  time,
  window = Inf,
  g = NULL,
  partial = TRUE,
  close_left_boundary = FALSE,
  time_type = getOption("timeplyr.time_type", "auto"),
  roll_month = getOption("timeplyr.roll_month", "preday"),
  roll_dst = getOption("timeplyr.roll_dst", "NA")
)
time_roll_window(
  x,
  window = Inf,
  time = seq_along(x),
  g = NULL,
  partial = TRUE,
  close_left_boundary = FALSE,
  time_type = getOption("timeplyr.time_type", "auto"),
  roll_month = getOption("timeplyr.roll_month", "preday"),
  roll_dst = getOption("timeplyr.roll_dst", "NA")
)
time_roll_apply(
  x,
  window = Inf,
  fun,
  time = seq_along(x),
  g = NULL,
  partial = TRUE,
  unlist = FALSE,
  close_left_boundary = FALSE,
  time_type = getOption("timeplyr.time_type", "auto"),
  roll_month = getOption("timeplyr.roll_month", "preday"),
  roll_dst = getOption("timeplyr.roll_dst", "NA")
)
Arguments
| x | Numeric vector. | 
| window | Time window size (Default is Inf). | 
| time | (Optional) time index.  | 
| weights | Importance weights. Must be the same length as x. Currently, no normalisation of weights occurs. | 
| g | Grouping object passed directly to  | 
| partial | Should calculations be done using partial windows?
Default is TRUE. | 
| close_left_boundary | Should the left boundary be closed?
For example, if you specify  | 
| na.rm | Should missing values be removed for the calculation?
The default is TRUE. | 
| time_type | If "auto",  | 
| roll_month | Control how impossible dates are handled when
month or year arithmetic is involved.
Options are "preday", "boundary", "postday", "full" and "NA".
See  | 
| roll_dst | See  | 
| ... | Additional arguments passed to  | 
| time_step | An optional but important argument
that follows the same input rules as  | 
| fun | A function. | 
| unlist | Should the output of  | 
Details
It is much faster if your data are already sorted such that
!is.unsorted(order(g, x)) is TRUE.
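As a minimal sketch, the sortedness condition can be checked, and the data pre-sorted, in base R. The condition is taken exactly as written above; the vectors g and x here are made-up illustration data.

```r
# Hypothetical data: two groups, rows not yet in sorted order
g <- c("b", "a", "b", "a")
x <- c(3, 2, 1, 4)

# The condition from the text: TRUE only if the data are already ordered
already_sorted <- !is.unsorted(order(g, x))

# Pre-sort once using the same ordering permutation.
# Any time index would need to be reordered with the same permutation.
o <- order(g, x)
g_sorted <- g[o]
x_sorted <- x[o]

sorted_now <- !is.unsorted(order(g_sorted, x_sorted))
```

Here already_sorted is FALSE for the raw vectors and sorted_now is TRUE after reordering.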
Growth rates
For growth rates across time, one can use time_step to incorporate
gaps in time into the calculation.
For example: 
x <- c(10, 20) 
t <- c(1, 10) 
k <- Inf
time_roll_growth_rate(x, time = t, window = k) = c(1, 2)
whereas 
time_roll_growth_rate(x, time = t, window = k, time_step = 1) = c(1, 1.08) 
The first is a doubling from 10 to 20, whereas the second implies a growth of
about 8% per time step over the 9 steps from time 1 to time 10. 
This allows us, for example, to calculate daily growth rates over the last few months,
even with missing days.
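The arithmetic behind the two results can be reproduced by hand in base R. This is a sketch of the calculation implied by the example, assuming the per-step rate is the geometric (compound) rate; it does not call the package.

```r
x <- c(10, 20)
t <- c(1, 10)
time_step <- 1

# Without time_step: plain ratio of the last value to the first
total_growth <- x[2] / x[1]

# With time_step = 1: number of steps elapsed between the two observations
n_steps <- (t[2] - t[1]) / time_step

# Per-step growth rate that compounds to the total growth over n_steps
per_step_growth <- total_growth^(1 / n_steps)
round(per_step_growth, 2)
```

Compounding per_step_growth over the 9 steps recovers the doubling, which is why the two results describe the same change at different time scales.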
Value
A vector the same length as time.
Examples
library(timeplyr)
library(lubridate)
library(dplyr)
time <- time_seq(today(), today() + weeks(3),
                 time_by = "3 days")
set.seed(99)
x <- sample.int(length(time))
roll_mean(x, window = 7)
roll_sum(x, window = 7)
time_roll_mean(x, window = ddays(7), time = time)
time_roll_sum(x, window = days(7), time = time)
# Alternatively and more verbosely
x_chunks <- time_roll_window(x, window = 7, time = time)
x_chunks
vapply(x_chunks, mean, 0)
# Interval (x - 3, x]
time_roll_sum(x, window = ddays(3), time = time)
# An example with an irregular time series
t <- today() + days(sort(sample(1:30, 20, TRUE)))
time_elapsed(t, days(1)) # See the irregular elapsed time
x <- rpois(length(t), 10)
tibble(x, t) %>%
  mutate(sum = time_roll_sum(x, time = t, window = days(3))) %>%
  time_ggplot(t, sum)
### Rolling mean example with many time series
# Sparse time with duplicates
index <- sort(sample(seq(now(), now() + dyears(3), by = "333 hours"),
                     250, TRUE))
x <- matrix(rnorm(length(index) * 10^3),
            ncol = 10^3, nrow = length(index),
            byrow = FALSE)
zoo_ts <- zoo::zoo(x, order.by = index)
# Normally you might attempt something like this
apply(x, 2,
      function(x){
        time_roll_mean(x, window = dmonths(1), time = index)
      }
)
# Unfortunately this is too slow and inefficient
# Instead we can pivot it longer and code each series as a separate group
tbl <- ts_as_tibble(zoo_ts)
tbl %>%
  mutate(monthly_mean = time_roll_mean(value, window = dmonths(1),
                                       time = time, g = group))