episodes_wf_splits {diyar}    R Documentation

Link events to chronological episodes.


episodes_wf_splits is a wrapper around episodes, designed to be more efficient with larger datasets. Duplicate records that do not affect the case definition are excluded before episode tracking. The resulting episode identifiers are then recycled for the excluded duplicates.


episodes_wf_splits(..., duplicates_recovered = "ANY", reframe = FALSE)



...    Arguments passed to episodes.


duplicates_recovered    [character]. Determines which duplicate records are recycled. Options are "ANY" (default), "without_sub_criteria", "with_sub_criteria" or "ALL". See Details.


reframe    [logical]. Determines if the duplicate records in a sub_criteria are reframed (TRUE) or excluded (FALSE).


episodes_wf_splits() reduces or reframes a dataset to the minimum set of records required to implement a case definition. This leads to the same outcome as episodes, but with a shorter processing time.
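The reduce-then-recycle idea can be illustrated in base R. This is only a minimal sketch of the general technique, not diyar's implementation: track_groups() is a hypothetical stand-in for an expensive tracking step, and the simple gap rule it uses is an assumption chosen to keep the example short.

```r
# Hypothetical stand-in for an expensive tracking step (not part of diyar).
# A new group starts whenever the gap from the previous date exceeds `window` days.
track_groups <- function(x, window = 1) {
  ord <- order(x)
  gaps <- c(Inf, diff(as.numeric(x[ord])))
  ids <- cumsum(gaps > window)
  out <- integer(length(x))
  out[ord] <- ids          # map identifiers back to the original record order
  out
}

dates <- as.Date("2019-04-01") + c(0, 1, 2, 10, 11, 0, 1, 10)

# 1. Reduce: keep one record per distinct date.
uniq <- unique(dates)

# 2. Track: run the expensive step on the reduced data only.
ids_unique <- track_groups(uniq, window = 1)

# 3. Recycle: copy each identifier back to every duplicate record.
ids_all <- ids_unique[match(dates, uniq)]

# Same outcome as tracking the full dataset directly.
identical(ids_all, track_groups(dates, window = 1))
```

Here the duplicates can be dropped safely because a repeated date never changes which group a record falls into. The duplicates_recovered and reframe arguments control the analogous decision when a sub_criteria is involved.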

The duplicates_recovered argument determines which identifiers are recycled. If "with_sub_criteria" is selected, only identifiers that result from a matched sub_criteria ("Case_CR" and "Recurrent_CR") are recycled. If "without_sub_criteria" is selected, only identifiers that do not result from a matched sub_criteria ("Case" and "Recurrent") are recycled. Excluded duplicates of "Duplicate_C" and "Duplicate_R" are always recycled.

The reframe argument determines whether a sub_criteria is reframed or subset. The two approaches require slightly different functions for match_funcs or equal_funcs.


epid; list

See Also

episodes; sub_criteria


library(diyar)

# With 2,000 duplicate records of 20 events,
# `episodes_wf_splits()` will take less time than `episodes()`
dates <- seq(from = as.Date("2019-04-01"), to = as.Date("2019-04-20"), by = 1)
dates <- rep(dates, 2000)

# Track episodes with a case_length of 1 day, with and without deduplication
ep1 <- episodes(dates, 1)
ep2 <- episodes_wf_splits(dates, 1)

# Both lead to the same outcome.
all(ep1 == ep2)

[Package diyar version 0.5.1 Index]