time_seq_id {timeplyr}    R Documentation

Generate a unique identifier for a regular time sequence with gaps

Description

A unique identifier is created every time a specified amount of time has passed, or in the case of regular sequences, when there is a gap in time.

Usage

time_seq_id(
  x,
  time_by = NULL,
  threshold = 1,
  g = NULL,
  na_skip = TRUE,
  rolling = TRUE,
  switch_on_boundary = FALSE,
  time_type = getOption("timeplyr.time_type", "auto")
)

Arguments

x

Date, datetime or numeric vector.

time_by

Time unit.
This signifies the granularity of the time data with which to measure gaps in the sequence. If your data is daily for example, supply time_by = "days". If weekly, supply time_by = "week". Must be one of the three:

  • string, specifying either the unit or the number and unit, e.g. time_by = "days" or time_by = "2 weeks"

  • named list of length one, the unit being the name, and the number the value of the list, e.g. list("days" = 7). For the vectorized time functions, you can supply multiple values, e.g. list("days" = 1:10).

  • Numeric vector. If time_by is a numeric vector and x is not a date/datetime, then arithmetic is used, e.g. time_by = 1.

threshold

Threshold such that when the time elapsed exceeds this, the sequence ID is incremented by 1. For example, if time_by = "days" and threshold = 2, then a new ID is created when more than 2 days have passed (or at least 2 days if switch_on_boundary = TRUE). Furthermore, threshold generally need not be supplied as
time_by = "3 days" & threshold = 1
is identical to
time_by = "days" & threshold = 3.
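The equivalence can be illustrated with a small base R sketch. This is not timeplyr's implementation: it assumes that elapsed time is measured in units of time_by, that the comparison is a strict "greater than" (the switch_on_boundary = FALSE case), and that IDs start at 1. The variable names are illustrative only.

```r
# Illustrative sketch in base R; not how timeplyr computes IDs internally.
x <- as.Date("2024-01-01") + c(0, 1, 2, 6, 7, 11)

# Elapsed days between successive time points
gap_days <- as.numeric(diff(x))

# time_by = "days", threshold = 3: compare the gap in days with 3
exceeds_a <- gap_days > 3

# time_by = "3 days", threshold = 1: compare the gap in 3-day units with 1
exceeds_b <- (gap_days / 3) > 1

# Start at ID 1 and increment whenever the threshold is exceeded
id_a <- cumsum(c(TRUE, exceeds_a))
id_b <- cumsum(c(TRUE, exceeds_b))

identical(id_a, id_b)
```

Both formulations flag the same gaps, so the resulting IDs agree.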

g

Object used for grouping x. This can for example be a vector or data frame. g is passed directly to collapse::GRP().

na_skip

Should NA values be skipped? Default is TRUE.

rolling

When this is TRUE (the default), elapsed time is measured from the previous time point. When this is FALSE, a new ID is created every time a cumulative amount of time has passed: once that amount of time has passed, a new ID is created, the clock "resets" and we start counting from that point.
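The two modes can be contrasted with a minimal base R sketch. It is an illustration under stated assumptions, not timeplyr's implementation: the helper elapsed_id() is hypothetical, it works on plain numbers in ascending order, and it uses a strict "greater than" comparison (the switch_on_boundary = FALSE case).

```r
# Hypothetical helper for illustration only; not part of timeplyr.
elapsed_id <- function(x, threshold, rolling = TRUE) {
  if (length(x) == 0L) return(integer())
  x <- as.numeric(x)
  id <- integer(length(x))
  current <- 1L
  anchor <- x[1]          # point from which cumulative time is counted
  id[1] <- current
  for (i in seq_along(x)[-1]) {
    # rolling: measure from the previous point; otherwise from the anchor
    ref <- if (rolling) x[i - 1] else anchor
    if (x[i] - ref > threshold) {
      current <- current + 1L
      anchor <- x[i]      # the clock "resets" at the point that triggered the new ID
    }
    id[i] <- current
  }
  id
}

x <- c(0, 2, 4, 6, 8, 10)
roll_ids <- elapsed_id(x, threshold = 5, rolling = TRUE)
cumulative_ids <- elapsed_id(x, threshold = 5, rolling = FALSE)
```

In this sketch no single step exceeds 5, so roll_ids never increments, whereas cumulative_ids increments once more than 5 units have accumulated since the last reset.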

switch_on_boundary

When an exact amount of time (specified in time_by) has passed, should there be an increment in ID? The default is FALSE. For example, if time_by = "days" and switch_on_boundary = FALSE, more than 1 day must have passed for a new ID to be created; if switch_on_boundary = TRUE, at least 1 day must have passed.
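In effect, the argument chooses between a strict and a non-strict comparison. A minimal base R sketch, assuming elapsed time is already expressed in units of time_by (the variable names are illustrative and not part of timeplyr):

```r
# Elapsed time between successive points, in units of time_by
gap <- c(0.5, 1, 2)
threshold <- 1

# switch_on_boundary = FALSE: a gap exactly on the boundary does not trigger a new ID
strict <- gap > threshold

# switch_on_boundary = TRUE: a gap exactly on the boundary does trigger a new ID
inclusive <- gap >= threshold
```

Only the middle gap, which sits exactly on the boundary, is treated differently by the two comparisons.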

time_type

If "auto", periods are used for the time expansion when days, weeks, months or years are specified, and durations are used otherwise.

Details

time_seq_id() assumes x is regular and in ascending or descending order. To check this condition formally, use time_is_regular().
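As a rough illustration of what this assumption means for weekly data, the following base R sketch checks ordering and step size by hand. It assumes that "regular" means every step is a whole multiple of the time unit, so gaps are allowed; time_is_regular() remains the formal check, and the variable names here are illustrative only.

```r
# Weekly dates with one missing week (a gap between the 2nd and 3rd points)
x <- as.Date("2024-01-01") + 7 * c(0, 1, 3, 4)

# Step sizes in days between successive points
steps <- as.numeric(diff(x))

# Ascending or descending order: all steps share the same sign
is_sorted <- all(steps >= 0) || all(steps <= 0)

# Every step is a whole number of weeks (assumed meaning of "regular" here)
is_weekly <- all(abs(steps) %% 7 == 0)
```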

Value

An integer vector of length(x).

Examples

library(dplyr)
library(timeplyr)
library(lubridate)

# Weekly sequence, with 2 gaps in between
x <- time_seq(today(), length.out = 10, time_by = "week")
x <- x[-c(3, 7)]
# A new ID when more than a week has passed since the last time point
time_seq_id(x, time_by = "week")
# A new ID when >= 2 weeks have passed since the last time point
time_seq_id(x, time_by = "weeks", threshold = 2, switch_on_boundary = TRUE)
# A new ID when at least 4 cumulative weeks have passed
time_seq_id(x, time_by = "4 weeks",
            switch_on_boundary = TRUE, rolling = FALSE)
# A new ID when more than 4 cumulative weeks have passed
time_seq_id(x, time_by = "4 weeks",
            switch_on_boundary = FALSE, rolling = FALSE)


[Package timeplyr version 0.8.1 Index]