R: Order IDs across two columns

arrange_ids {skater}

R Documentation

Order IDs across two columns

Description

Some data or results are indexed by a pair of identifiers stored in two columns, where each row holds a value for a pair of observations. For example, columns id1 and id2 might index a tibble holding results for every pair among samples A, B, and C. If you join two such tibbles with by=c("id1", "id2"), rows will not match when the same pair is stored in opposite order in the two datasets, e.g., one tibble has id1=A and id2=B while the other has id1=B and id2=A. This function reorders the IDs within each row so that id1 is alphanumerically first and id2 is alphanumerically second. See examples.
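
The row-wise ordering described above can be sketched in base R. This is only a minimal illustration and not necessarily how skater implements `arrange_ids`. The helper name `order_pair_sketch` is hypothetical, and the sketch assumes both ID columns can be compared as character strings (string comparison order depends on the locale).

```r
# Hypothetical helper (not part of skater): order two ID columns within each row.
order_pair_sketch <- function(df, id1 = "id1", id2 = "id2") {
  a <- as.character(df[[id1]])
  b <- as.character(df[[id2]])
  df[[id1]] <- pmin(a, b)  # alphanumerically first ID of each pair
  df[[id2]] <- pmax(a, b)  # alphanumerically second ID of each pair
  df
}

d <- data.frame(
  id1 = c("a", "a", "c"),
  id2 = c("b", "c", "b"),
  results1 = c(10L, 20L, 30L)
)
ordered <- order_pair_sketch(d)
ordered
```

Only the third row changes here: its pair is stored as ("c", "b") and comes back as ("b", "c"), while the other columns are left untouched.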

Usage

arrange_ids(.data, .id1, .id2)

Arguments

`.data`	A tibble with two ID columns to arrange.
`.id1`	Unquoted name of the "id1" column. See examples.
`.id2`	Unquoted name of the "id2" column. See examples.

Value

A tibble in which, for each row, id1 holds the alphanumerically first ID and id2 the alphanumerically second.

Examples

d1 <- tibble::tribble(
  ~id1, ~id2, ~results1,
  "a",  "b",       10L,
  "a",  "c",       20L,
  "c",  "b",       30L
)
d2 <- tibble::tribble(
  ~id1, ~id2,  ~results2,
  "b",  "a",       101L,
  "c",  "a",       201L,
  "b",  "c",       301L
)
# Inner join returns no rows because each pair's IDs are flipped between d1 and d2.
dplyr::inner_join(d1, d2, by=c("id1", "id2"))
# Arrange IDs
d1 %>% arrange_ids(id1, id2)
d2 %>% arrange_ids(id1, id2)
# Inner join
dplyr::inner_join(arrange_ids(d1, id1, id2), arrange_ids(d2, id1, id2), by=c("id1", "id2"))
# With more than two tibbles: arrange the IDs in each, then reduce with successive inner joins
list(d1, d2) %>%
  purrr::map(arrange_ids, id1, id2) %>%
  purrr::reduce(dplyr::inner_join, by=c("id1", "id2"))

[Package skater version 0.1.2 Index]