iv-groups {ivs}    R Documentation

Group overlapping intervals

Description

This family of functions revolves around grouping overlapping intervals within a single iv. When multiple overlapping intervals are grouped together, the result is a single wider interval that spans from the smallest iv_start() to the largest iv_end() of the overlaps.

Optionally, you can choose not to group abutting intervals together with abutting = FALSE, which can be useful if you'd like to retain those boundaries.

Minimal interval vectors

iv_groups() is particularly useful because it can generate a minimal interval vector, which covers the range of an interval vector in the most compact form possible. In particular, a minimal interval vector:

- Has no overlapping intervals

- Has no abutting intervals

- Is ordered on both start and end

A minimal interval vector is allowed to have a single missing interval, which is located at the end of the vector.
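
As a rough illustration, the properties of a minimal interval vector can be checked by hand on the output of iv_groups(). The input below is made up for this sketch; only iv_pairs(), iv_groups(), iv_start() and iv_end() come from ivs, and the checks themselves are plain base R rather than anything the package provides.

```r
library(ivs)

# Hypothetical input: unordered, with one overlapping pair and one abutting pair
x <- iv_pairs(c(7, 9), c(1, 3), c(2, 4), c(4, 6))

# Group with the default `abutting = TRUE`
g <- iv_groups(x)
g

starts <- iv_start(g)
ends <- iv_end(g)

# Ordered on both start and end
sorted <- !is.unsorted(starts) && !is.unsorted(ends)

# Each interval ends strictly before the next one starts,
# so there are no overlapping and no abutting intervals
separated <- all(ends[-length(ends)] < starts[-1])

sorted
separated
```

Here the four input intervals collapse to two groups covering [1, 6) and [7, 9), and both checks should come back TRUE.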

Usage

iv_groups(x, ..., abutting = TRUE)

iv_identify_group(x, ..., abutting = TRUE)

iv_locate_groups(x, ..., abutting = TRUE)

Arguments

x

[iv]

An interval vector.

...

These dots are for future extensions and must be empty.

abutting

[TRUE / FALSE]

Should abutting intervals be grouped together?

If TRUE, [a, b) and [b, c) will merge as [a, c). If FALSE, they will be kept separate. To be a minimal interval vector, all abutting intervals must be grouped together.
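
A minimal sketch of the difference, using two made-up abutting intervals [1, 2) and [2, 3). The variable names are illustrative only.

```r
library(ivs)

# Two intervals that touch at 2 but do not overlap
x <- iv_pairs(c(1, 2), c(2, 3))

# Default: abutting intervals merge into a single interval, [1, 3)
merged <- iv_groups(x)
merged

# With `abutting = FALSE` the boundary at 2 is retained,
# so both [1, 2) and [2, 3) are returned
kept <- iv_groups(x, abutting = FALSE)
kept
```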

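Value

For iv_groups(), an iv with the same type as x.

For iv_identify_group(), an iv with the same type and size as x.

For iv_locate_groups(), a two column data frame with a key column containing the result of iv_groups() and a loc list-column containing integer vectors.
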
Graphical Representation

Graphically, generating groups looks like:

[Figure: groups.png]

With abutting = FALSE, intervals that touch aren't grouped:

[Figure: groups-abutting-keep.png]

Examples

library(ivs)
library(dplyr, warn.conflicts = FALSE)

x <- iv_pairs(
  c(1, 5),
  c(2, 3),
  c(NA, NA),
  c(5, 6),
  c(NA, NA),
  c(9, 12),
  c(11, 14)
)
x

# Grouping removes all redundancy while still covering the full range
# of values that were originally represented. If any missing intervals
# are present, a single one is retained.
iv_groups(x)

# Abutting intervals are typically grouped together, but you can choose not
# to group them if you want to retain those boundaries
iv_groups(x, abutting = FALSE)

# `iv_identify_group()` is useful alongside `group_by()` and `summarize()`
df <- tibble(x = x)
df <- mutate(df, u = iv_identify_group(x))
df

df %>%
  group_by(u) %>%
  summarize(n = n())

# The real workhorse here is `iv_locate_groups()`, which returns
# the groups and information on which observations in `x` fall in which
# group
iv_locate_groups(x)
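
# A possible follow-up, sketched on a small made-up vector `y` rather than
# taken from the package's own examples: the locations can be used to slice
# the input back into its groups. This assumes the result has a `loc`
# list-column of integer positions alongside the `key` column.
library(ivs)
y <- iv_pairs(c(1, 5), c(2, 3), c(8, 9))
locs <- iv_locate_groups(y)
lapply(locs$loc, function(i) y[i])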

[Package ivs version 0.2.0 Index]