bundle {constellation} | R Documentation |
Identify bundle items that occur around a given event
Description
Reads a data frame of incident events and one or more time series data frames of bundle items, and determines whether each bundle item occurs within a defined time window around each incident event. The user must supply a name for each bundle item, the time window to consider around the incident events, a name for the incident event, and the variables used to join the tables. The user can also specify whether to return every instance of each bundle item that occurs around the incident event, or only the first or last instance.

All time series data frames must contain a column for joining the tables (join_key) and a time stamp column (time_var). The time_var column must be of class POSIXct in every data frame. The function accepts an arbitrary number of bundle item data frames around an incident event.
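The windowing idea described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper items_in_window(), the toy data, the column names, and the choice of inclusive window boundaries are all assumptions made for illustration. Unlike a full result table, this sketch also drops events that have no bundle item in their window.

```r
# Hypothetical toy data; column names are illustrative only.
events <- data.frame(
  PAT_ID = c(1, 2),
  EVENT_TIME = as.POSIXct(c("2020-01-02 12:00:00", "2020-01-05 08:00:00"),
                          tz = "UTC")
)
items <- data.frame(
  PAT_ID = c(1, 1, 1, 2),
  ITEM_TIME = as.POSIXct(c("2020-01-01 13:00:00", "2020-01-02 17:00:00",
                           "2020-01-02 19:00:00", "2020-01-03 08:00:00"),
                         tz = "UTC")
)

# Hypothetical helper: keep item time stamps that fall inside
# [event - hours_pre, event + hours_post] (boundaries assumed inclusive).
items_in_window <- function(events, items, hours_pre, hours_post,
                            mult = c("all", "first", "last")) {
  mult <- match.arg(mult)
  # Pair every event with every item recorded for the same patient.
  pairs <- merge(events, items, by = "PAT_ID")
  start <- pairs$EVENT_TIME - hours_pre * 3600   # POSIXct arithmetic is in seconds
  end <- pairs$EVENT_TIME + hours_post * 3600
  hits <- pairs[pairs$ITEM_TIME >= start & pairs$ITEM_TIME <= end, ]
  hits <- hits[order(hits$PAT_ID, hits$EVENT_TIME, hits$ITEM_TIME), ]
  if (mult == "all") return(hits)
  # One group per event; keep its earliest or latest item.
  grp <- paste(hits$PAT_ID, as.numeric(hits$EVENT_TIME))
  keep <- if (mult == "first") !duplicated(grp) else !duplicated(grp, fromLast = TRUE)
  hits[keep, ]
}

all_hits <- items_in_window(events, items, 24, 6, mult = "all")
first_hit <- items_in_window(events, items, 24, 6, mult = "first")
last_hit <- items_in_window(events, items, 24, 6, mult = "last")
```

With this toy data, only the first patient has items inside the window of 24 hours before and 6 hours after the event, so mult = "all" keeps two rows and "first" or "last" keeps one row each.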
Usage
bundle(events, ..., bundle_names, window_hours_pre, window_hours_post, join_key,
time_var, event_name, mult = c("all", "first", "last"))
Arguments
events |
A time series data frame of incident events. Bundle items are searched for within a given time window around these events. The events data frame must include the columns 'join_key' and 'time_var'. |
... |
An arbitrary number of time series data frames that each include the columns 'join_key' and 'time_var'. Each data frame consists of a bundle item that is important to find around the specified events. |
bundle_names |
A vector of strings specifying the name of each bundle item. The order of strings in the vector should align with the order of data frames passed in '...'. |
window_hours_pre |
A single numeric or vector of numerics specifying the number of hours before the events in the events data frame that each bundle item is considered relevant. If a single numeric is passed, that time window before the events is applied to all bundle items. |
window_hours_post |
A single numeric or vector of numerics specifying the number of hours after the events in the events data frame that each bundle item is considered relevant. If a single numeric is passed, that time window after the events is applied to all bundle items. |
join_key |
A string naming the column used to join all time series data frames. |
time_var |
A string name of the time stamp column in all time series data frames. The class of time_var must be POSIXct in all data frames. |
event_name |
A string naming the events in the events data frame. |
mult |
A string specifying whether to return the first, last, or all instance(s) of every bundle item occurring within the specified time window of events. The default value is "all". |
Value
An object of class data.table and data.frame with a time stamp for every event of interest, columns for the start and end of the time window of interest, and a column for every bundle item. The value in each bundle item column is the time stamp (time_var) at which the bundle item is observed within the given window.
Imported functions
foverlaps() from data.table and general data.table syntax
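For readers unfamiliar with foverlaps(), the following is a minimal sketch of an overlap join that finds items inside a window around each event. The toy tables and column names are hypothetical, and the sketch shows only the general technique, not how bundle() calls foverlaps() internally.

```r
library(data.table)

# Hypothetical toy data; column names are illustrative only.
events <- data.table(
  PAT_ID = c(1, 2),
  EVENT_TIME = as.POSIXct(c("2020-01-02 12:00:00", "2020-01-05 08:00:00"),
                          tz = "UTC")
)
items <- data.table(
  PAT_ID = c(1, 1, 2),
  ITEM_TIME = as.POSIXct(c("2020-01-01 13:00:00", "2020-01-02 19:00:00",
                           "2020-01-05 09:00:00"),
                         tz = "UTC")
)

# Window of 24 hours before and 6 hours after each event.
events[, WINDOW_START := EVENT_TIME - 24 * 3600]
events[, WINDOW_END := EVENT_TIME + 6 * 3600]

# foverlaps() needs an interval on both sides, so each item time stamp
# is represented as a zero-length interval.
items[, ITEM_START := ITEM_TIME]
items[, ITEM_END := ITEM_TIME]

# The second table must be keyed; its last two key columns are the interval.
setkey(events, PAT_ID, WINDOW_START, WINDOW_END)

# Keep only items whose interval lies within an event window for the same patient.
hits <- foverlaps(items, events,
                  by.x = c("PAT_ID", "ITEM_START", "ITEM_END"),
                  type = "within", nomatch = 0L)
```

foverlaps() has its own mult argument that accepts "all", "first", or "last", the same three choices that bundle() exposes.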
Errors
This function returns errors for:
missing arguments (only the mult argument has a default value)
passing arguments with invalid classes (events and bundle items must be data frames, bundle_names must be strings, window_hours_pre and window_hours_post must be numerics, and event_name must be a string)
passing an invalid mult value
passing join_key or time_var values that are not column names in all time series data frames
passing an invalid number of window_hours_pre or window_hours_post values (the number of values must be 1 or the number of bundle data frames).
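The checks listed above can be approximated in a few lines of base R. The helper check_bundle_args() below is hypothetical and not part of the package. It also assumes that bundle_names must have one entry per bundle data frame, which the argument description implies but the error list does not state.

```r
# Hypothetical helper, not part of the constellation package.
check_bundle_args <- function(n_items, bundle_names, window_hours_pre,
                              window_hours_post,
                              mult = c("all", "first", "last")) {
  # match.arg() errors on any value other than "all", "first", or "last".
  mult <- match.arg(mult)
  # Assumption: one name per bundle data frame.
  stopifnot(is.character(bundle_names), length(bundle_names) == n_items)
  # Each window argument must hold 1 value or one value per bundle data frame.
  for (w in list(window_hours_pre, window_hours_post)) {
    stopifnot(is.numeric(w), length(w) %in% c(1, n_items))
  }
  # Recycle a single window value across all bundle items.
  list(
    pre = rep_len(window_hours_pre, n_items),
    post = rep_len(window_hours_post, n_items),
    mult = mult
  )
}

checked <- check_bundle_args(2, c("PULSE", "RESPIRATORY_RATE"), 24, c(6, 6))
checked$pre  # 24 24

# Three window values for two bundle items is rejected.
bad <- tryCatch(
  check_bundle_args(2, c("PULSE", "RESPIRATORY_RATE"), c(24, 12, 6), 6),
  error = function(e) "error"
)
```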
Examples
library(data.table)
temp <- as.data.table(vitals[VARIABLE == "TEMPERATURE"])
pulse <- as.data.table(vitals[VARIABLE == "PULSE"])
resp <- as.data.table(vitals[VARIABLE == "RESPIRATORY_RATE"])
temp[, RECORDED_TIME := as.POSIXct(RECORDED_TIME,
  format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")]
pulse[, RECORDED_TIME := as.POSIXct(RECORDED_TIME,
  format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")]
resp[, RECORDED_TIME := as.POSIXct(RECORDED_TIME,
  format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")]
# Pass single window_hours_pre
# All instances of bundle items within time window of event
bundle(temp, pulse, resp,
  bundle_names = c("PULSE", "RESPIRATORY_RATE"), window_hours_pre = 24,
  window_hours_post = c(6, 6), join_key = "PAT_ID",
  time_var = "RECORDED_TIME", event_name = "TEMPERATURE", mult = "all")
# Pass different window_hours_pre for each bundle time series data frame
# All instances of bundle items within time window of event
bundle(temp, pulse, resp,
  bundle_names = c("PULSE", "RESPIRATORY_RATE"), window_hours_pre = c(24, 12),
  window_hours_post = c(6, 6), join_key = "PAT_ID",
  time_var = "RECORDED_TIME", event_name = "TEMPERATURE", mult = "all")
# Pass different window_hours_pre for each bundle time series data frame
# First instance of each bundle item within time window of event
bundle(temp, pulse, resp,
  bundle_names = c("PULSE", "RESPIRATORY_RATE"), window_hours_pre = c(24, 12),
  window_hours_post = c(6, 6), join_key = "PAT_ID",
  time_var = "RECORDED_TIME", event_name = "TEMPERATURE", mult = "first")
# Pass different window_hours_pre for each bundle time series data frame
# Last instance of each bundle item within time window of event
bundle(temp, pulse, resp,
  bundle_names = c("PULSE", "RESPIRATORY_RATE"), window_hours_pre = c(24, 12),
  window_hours_post = c(6, 6), join_key = "PAT_ID",
  time_var = "RECORDED_TIME", event_name = "TEMPERATURE", mult = "last")