| iv-groups {ivs} | R Documentation |
Group overlapping intervals
Description
This family of functions revolves around grouping overlapping intervals
within a single iv. When multiple overlapping intervals are grouped together,
they result in a wider interval containing the smallest iv_start() and the
largest iv_end() of the overlaps.
- iv_groups() merges all overlapping intervals found within x. The resulting intervals are known as the "groups" of x.
- iv_identify_group() identifies the group that the current interval of x falls in. This is particularly useful alongside dplyr::group_by().
- iv_locate_groups() returns a two column data frame with a key column containing the result of iv_groups() and a loc list-column containing integer vectors that map each interval in x to the group that it falls in.
Optionally, you can choose not to group abutting intervals together with
abutting = FALSE, which can be useful if you'd like to retain those
boundaries.
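As a small illustration of the behavior described above, the following sketch uses made-up values (the vector x and its endpoints are assumptions chosen for this example). It assumes the ivs package is installed and that intervals are right-open, so an interval ending at 7 abuts, but does not overlap, one starting at 7.

```r
library(ivs)

# Illustrative values: the first two intervals overlap, and the third
# abuts the second (it starts exactly where the second one ends).
x <- iv_pairs(c(1, 5), c(3, 7), c(7, 9))

# Default (abutting = TRUE): all three collapse into one group that spans
# the smallest start (1) and the largest end (9).
merged <- iv_groups(x)
iv_start(merged)
iv_end(merged)

# abutting = FALSE: the overlapping pair still merges, but the boundary
# at 7 is retained, so two groups come back.
separate <- iv_groups(x, abutting = FALSE)
separate
```

Comparing merged and separate side by side is a quick way to decide whether the boundaries between touching intervals matter for a given analysis.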
Minimal interval vectors
iv_groups() is particularly useful because it can generate a minimal
interval vector, which covers the range of an interval vector in the most
compact form possible. In particular, a minimal interval vector:
- Has no overlapping intervals
- Has no abutting intervals
- Is ordered on both start and end
A minimal interval vector is allowed to have a single missing interval, which is located at the end of the vector.
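The properties listed above can be checked by hand on the result of iv_groups(). The sketch below is only an illustration: the input values and the helper variable names (is_sorted, has_gaps, missing_last) are assumptions made for this example and are not part of the ivs API.

```r
library(ivs)

# Illustrative, deliberately unsorted input with overlaps, an abutting
# pair, and two missing intervals.
x <- iv_pairs(c(9, 12), c(1, 5), c(NA, NA), c(5, 6), c(11, 14), c(NA, NA))

m <- iv_groups(x)
m

# Positions of the non-missing groups
present <- !is.na(iv_start(m))
starts <- iv_start(m)[present]
ends <- iv_end(m)[present]

# Ordered on both start and end
is_sorted <- !is.unsorted(starts) && !is.unsorted(ends)

# No overlapping or abutting neighbors: each group must end strictly
# before the next one starts.
has_gaps <- all(head(ends, -1) < tail(starts, -1))

# At most one missing interval, located in the last position
n_missing <- sum(!present)
missing_last <- n_missing == 0 || !present[length(m)]
```

If all of these checks hold, the vector satisfies the conditions described in this section for the chosen input.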
Usage
iv_groups(x, ..., abutting = TRUE)
iv_identify_group(x, ..., abutting = TRUE)
iv_locate_groups(x, ..., abutting = TRUE)
Arguments
| x | An interval vector. |
| ... | These dots are for future extensions and must be empty. |
| abutting | Should abutting intervals be grouped together? If |
Value
- For iv_groups(), an iv with the same type as x.
- For iv_identify_group(), an iv with the same type and size as x.
- For iv_locate_groups(), a two column data frame with a key column containing the result of iv_groups() and a loc list-column containing integer vectors.
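To make the three return shapes concrete, here is a short sketch with made-up input (the values in x are assumptions for illustration). It only inspects sizes and column names, which follow from the descriptions above.

```r
library(ivs)

# Illustrative input: two pairs of overlapping intervals
x <- iv_pairs(c(1, 5), c(2, 3), c(9, 12), c(11, 14))

groups <- iv_groups(x)          # one element per group
ids <- iv_identify_group(x)     # one element per element of x
locs <- iv_locate_groups(x)     # data frame with `key` and `loc` columns

length(groups)
length(ids)
names(locs)

# Each element of `loc` holds the positions in `x` that belong to the
# group stored in the matching row of `key`.
locs$loc
```

Because iv_identify_group() returns a vector the same size as x, it can be added directly as a data frame column, whereas iv_locate_groups() is more convenient when the positions themselves are needed.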
Graphical Representation
Graphically, generating groups looks like:
With abutting = FALSE, intervals that touch aren't grouped:
Examples
library(ivs)
library(dplyr, warn.conflicts = FALSE)
x <- iv_pairs(
  c(1, 5),
  c(2, 3),
  c(NA, NA),
  c(5, 6),
  c(NA, NA),
  c(9, 12),
  c(11, 14)
)
x
# Grouping removes all redundancy while still covering the full range
# of values that were originally represented. If any missing intervals
# are present, a single one is retained.
iv_groups(x)
# Abutting intervals are typically grouped together, but you can choose not
# to group them if you want to retain those boundaries
iv_groups(x, abutting = FALSE)
# `iv_identify_group()` is useful alongside `group_by()` and `summarize()`
df <- tibble(x = x)
df <- mutate(df, u = iv_identify_group(x))
df
df %>%
  group_by(u) %>%
  summarize(n = n())
# The real workhorse here is `iv_locate_groups()`, which returns
# the groups and information on which observations in `x` fall in which
# group
iv_locate_groups(x)