thicken {padr} | R Documentation |
Add a variable of a higher interval to a data frame
Description
Take the datetime variable in a data frame and map it to a variable of a higher interval, for example from hours to days. The mapping is added to the data frame as a new variable.
Usage
thicken(
  x,
  interval,
  colname = NULL,
  rounding = c("down", "up"),
  by = NULL,
  start_val = NULL,
  drop = FALSE,
  ties_to_earlier = FALSE
)
Arguments
x |
A data frame containing at least one datetime variable of
class |
interval |
The interval of the added datetime variable.
Any character string that would be accepted by |
colname |
The column name of the added variable. If |
rounding |
Should a value in the input datetime variable be mapped to
the closest value that is lower ("down") or higher ("up") than itself?
by |
Only needs to be specified when |
start_val |
By default the first instance of |
drop |
Should the original datetime variable be dropped from the
returned data frame? Defaults to FALSE.
ties_to_earlier |
By default, when an original datetime observation is
tied with a value in the added datetime variable, it is assigned to the
current value when rounding is down or to the next value when rounding
is up. When |
Details
When the datetime variable contains missing values, they are left in place in the data frame. The added column with the new datetime variable will have missing values for these rows as well.
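A minimal sketch of the missing-value behaviour described above, using a made-up `visits` data frame. It assumes that thicken may signal a warning about the missing values, which is silenced here only to keep the output short.

```r
library(padr)

# Hypothetical data with one missing date in the second row.
visits <- data.frame(
  day = as.Date(c("2016-03-02", NA, "2016-03-15", "2016-04-01")),
  n   = 1:4
)

# Assumption: a warning about the NA values may be raised; suppress it here.
visits_month <- suppressWarnings(thicken(visits, "month"))

# The row with the missing date is still present, and the added column
# is missing for that row as well.
visits_month
```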
See vignette("padr") for more information on thicken.
See vignette("padr_implementation") for detailed information on daylight saving time, different time zones, and the implementation of thicken.
Value
The data frame x with the variable added to it.
Examples
library(padr)
x_hour <- seq(lubridate::ymd_hms('20160302 000000'), by = 'hour',
              length.out = 200)
some_df <- data.frame(x_hour = x_hour)
thicken(some_df, 'week')
thicken(some_df, 'month')
thicken(some_df, 'day', start_val = lubridate::ymd_hms('20160301 120000'))
library(dplyr)
x_df <- data.frame(
  x = seq(lubridate::ymd(20130101), by = 'day', length.out = 1000) %>%
    sample(500),
  y = runif(500, 10, 50) %>% round) %>%
  arrange(x)
# get the max per month
x_df %>% thicken('month') %>% group_by(x_month) %>%
  summarise(y_max = max(y))
# get the average per week, but you want your week to start on Mondays
# instead of Sundays
x_df %>% thicken('week',
                 start_val = closest_weekday(x_df$x, 2)) %>%
  group_by(x_week) %>% summarise(y_avg = mean(y))
# rounding up instead of down
x <- data.frame(dt = lubridate::ymd_hms('20171021 160000',
                                        '20171021 163100'))
thicken(x, interval = "hour", rounding = "up")
thicken(x, interval = "hour", rounding = "up", ties_to_earlier = TRUE)
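To make the effect of `rounding` and `ties_to_earlier` easier to compare, the sketch below puts the four combinations side by side for the same two timestamps. The helper `added_hour()` is not part of padr; it is defined here only for this comparison, and it assumes the added column gets the default name `dt_hour`.

```r
library(padr)

x <- data.frame(dt = lubridate::ymd_hms("20171021 160000",
                                        "20171021 163100"))

# Hypothetical helper (not part of padr): thicken to hours and return the
# added column formatted as "HH:MM".
added_hour <- function(...) {
  format(thicken(x, interval = "hour", ...)$dt_hour, "%H:%M")
}

comparison <- data.frame(
  dt           = format(x$dt, "%H:%M"),
  down         = added_hour(rounding = "down"),
  down_earlier = added_hour(rounding = "down", ties_to_earlier = TRUE),
  up           = added_hour(rounding = "up"),
  up_earlier   = added_hour(rounding = "up", ties_to_earlier = TRUE)
)
comparison
```

Under the defaults described in the arguments, the tied 16:00 observation is expected to land in the 16:00 value when rounding down and in the 17:00 value when rounding up; the `_earlier` columns show how `ties_to_earlier = TRUE` changes the assignment of that tied observation.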