network {RepertoiR}    R Documentation

Sequences distance network

Description

Computes pairwise string distances among a repertoire's sequences and visualizes similar pairs as connected nodes, each sized by its frequency.
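
The idea of connecting similar sequences can be sketched in a few lines of base R. This is only an illustration and not the package's internal code: the toy sequences, the use of utils::adist (generalized Levenshtein distance) and the cutoff value threshold are all assumptions made for the example.

```r
# Toy sequences, invented for illustration only.
seqs <- c("CASSL", "CASSI", "CATSL", "GGGGG")

# Pairwise generalized Levenshtein distances with base R.
d <- utils::adist(seqs)
dimnames(d) <- list(seqs, seqs)

# Hypothetical similarity cutoff: connect pairs at distance <= 1.
threshold <- 1

# Keep each unordered pair once by looking only at the upper triangle.
pairs <- which(d <= threshold & upper.tri(d), arr.ind = TRUE)
edges <- data.frame(
  from = seqs[pairs[, "row"]],
  to   = seqs[pairs[, "col"]]
)
print(edges)
```

Each row of edges is one connection in the graph; sequences that appear in no row (here "GGGGG") would be drawn as isolated nodes or dropped, depending on the plotting choice.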

Usage

network(dataset, by, nrow, method, ...)

Arguments

dataset

A matrix or data frame whose row names are the sequences to compare. The numeric values of the data set determine node size.

by

Index of the column whose values set the node size. Default is the first column (1).

nrow

Number of nodes to display. Default is 1000 nodes.

method

stringdist method used to calculate the distance between sequences: "osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex". Default is Levenshtein distance ("lv").
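
The choice of method can change which sequences count as similar. A minimal sketch, assuming the stringdist package is installed and calling stringdist::stringdist directly rather than through network; the two sequences are invented for illustration:

```r
# Compare two methods on one pair of equal-length sequences.
# Guarded so the snippet is skipped when stringdist is not available.
if (requireNamespace("stringdist", quietly = TRUE)) {
  a <- "CASSL"
  b <- "ASSLC"

  # Levenshtein: insertions, deletions and substitutions are allowed.
  lv <- stringdist::stringdist(a, b, method = "lv")

  # Hamming: position-by-position mismatches only.
  hm <- stringdist::stringdist(a, b, method = "hamming")

  print(c(lv = lv, hamming = hm))
}
```

Because b is a shifted copy of a, an edit-based distance stays small while a positional distance is large, so the same pair may or may not be linked depending on method.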

...

Any additional arguments needed by the specialized methods.

Value

No return value.

Examples


aa <- c(
  "G", "A", "V", "L", "I", "P", "F", "Y", "W", "S",
  "T", "N", "Q", "C", "M", "D", "E", "H", "K", "R"
)
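# Simulated frequencies: 250 sequences (rows) by 4 columns
data <- matrix(rexp(n = 1000, rate = 1 / 2), ncol = 4)
# Random 10-residue consensus sequence
cons <- sample(aa, 10)
aavec <- character(0)

# Resample 1 to 3 positions of the consensus until there is one
# unique sequence per row of data
while (length(aavec) < nrow(data)) {
  aaseq <- cons
  index <- sample(length(aaseq), sample(length(aaseq) %/% 3, 1))
  aaseq[index] <- sample(aa, length(index), replace = TRUE)
  aaseq <- paste0(aaseq, collapse = "")
  aavec <- unique(append(aavec, aaseq))
}

rownames(data) <- aavec
colnames(data) <- LETTERS[seq_len(ncol(data))]

# Display 100 nodes, sized by the values in column 3
network(data, by = 3, nrow = 100)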

[Package RepertoiR version 0.0.1 Index]