R: Calculate summaries from overlapping intervals.

bed_map {valr}

R Documentation

Calculate summaries from overlapping intervals.

Description

Apply functions like min() and count() to intersecting intervals. bed_map() uses bed_intersect() to identify intersecting intervals, so output columns will be suffixed with .x and .y. Expressions that refer to input columns from x and y must take these suffixes into account.

Usage

bed_map(x, y, ..., min_overlap = 1)

concat(.data, sep = ",")

values_unique(.data, sep = ",")

values(.data, sep = ",")

Arguments

`x`	ivl_df
`y`	ivl_df
`...`	name-value pairs specifying column names and expressions to apply
`min_overlap`	minimum overlap (in base pairs) required for intervals to count as intersecting.
`.data`	data
`sep`	separator character

Details

Book-ended intervals can be included by setting min_overlap = 0.

Non-intersecting intervals from x are included in the result with NA values.

Input tbls are grouped by chrom by default, and additional groups can be added using dplyr::group_by(). For example, grouping by strand will constrain analyses to the same strand. To compare opposing strands across two tbls, strands on the y tbl can first be inverted using flip_strands().

Value

ivl_df

Examples

x <- tibble::tribble(
 ~chrom, ~start, ~end,
 'chr1', 100,    250,
 'chr2', 250,    500
)

y <- tibble::tribble(
 ~chrom, ~start, ~end, ~value,
 'chr1', 100,    250,  10,
 'chr1', 150,    250,  20,
 'chr2', 250,    500,  500
)

bed_glyph(bed_map(x, y, value = sum(value)), label = 'value')

# summary examples
bed_map(x, y, .sum = sum(value))

bed_map(x, y, .min = min(value), .max = max(value))

# identify non-intersecting intervals to include in the result
res <- bed_map(x, y, .sum = sum(value))
x_not <- bed_intersect(x, y, invert = TRUE)
dplyr::bind_rows(res, x_not)

# create a list-column
bed_map(x, y, .values = list(value))

# use `nth` family from dplyr
bed_map(x, y, .first = dplyr::first(value))

bed_map(x, y, .absmax = abs(max(value)))

bed_map(x, y, .count = length(value))

bed_map(x, y, .vals = values(value))

# counts for non-intersecting intervals default to NA, not 0; this differs from bedtools2 ...
bed_map(x, y, .counts = dplyr::n())

# ... but NA counts can be converted to 0
dplyr::mutate(
 bed_map(x, y, .counts = dplyr::n()),
 .counts = ifelse(is.na(.counts), 0, .counts)
)
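
# book-ended intervals (see Details): a minimal sketch using made-up tables
# `a` and `b` (illustrative only), where `a` ends exactly where `b` starts
a <- tibble::tribble(
 ~chrom, ~start, ~end,
 'chr1', 100,    200
)

b <- tibble::tribble(
 ~chrom, ~start, ~end, ~value,
 'chr1', 200,    300,  5
)

# default min_overlap = 1
bookend_default <- valr::bed_map(a, b, .sum = sum(value))
bookend_default

# min_overlap = 0 includes book-ended intervals
bookend_included <- valr::bed_map(a, b, .sum = sum(value), min_overlap = 0)
bookend_included

# strand-aware summaries (see Details): a sketch that assumes both tables
# carry a `strand` column; `xs` and `ys` are hypothetical tables
xs <- tibble::tribble(
 ~chrom, ~start, ~end, ~strand,
 'chr1', 100,    250,  '+'
)

ys <- tibble::tribble(
 ~chrom, ~start, ~end, ~strand, ~value,
 'chr1', 100,    250,  '+',     10,
 'chr1', 150,    250,  '-',     20
)

# group both tables by strand to constrain summaries to the same strand
same_strand <- valr::bed_map(
 dplyr::group_by(xs, strand),
 dplyr::group_by(ys, strand),
 .sum = sum(value)
)
same_strand

# invert strands on `ys` first to compare opposing strands
opposite_strand <- valr::bed_map(
 dplyr::group_by(xs, strand),
 dplyr::group_by(valr::flip_strands(ys), strand),
 .sum = sum(value)
)
opposite_strand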

[Package valr version 0.8.1 Index]