supCount {PTXQC}    R Documentation

Compute the shortest prefix length that makes all strings in a vector uniquely identifiable.

Description

If no unique prefix exists (e.g. if a string occurs twice in the vector), the length of the longest string is returned. Using that return value in a call to substr therefore leaves the strings unchanged, i.e. substr(x, 1, supCount(x)) == x.
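
The fallback can be checked with base R alone. The short sketch below does not call supCount; it only shows why a stop value equal to the longest string length makes substr a no-op. The variable name longest is introduced here for illustration.

```r
# Base R only: taking a prefix as long as the longest string
# returns every string unchanged.
x <- c("doubled", "doubled", "aLongDummyString")
longest <- max(nchar(x))          # 16
all(substr(x, 1, longest) == x)   # TRUE
```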

Usage

supCount(x, prefix_l = 1)

Arguments

x

Vector of strings

prefix_l

Starting prefix length, which is incremented in steps of 1 until all prefixes are unique (or the maximum string length is reached)

Value

Integer with minimal prefix length required
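
The incremental search described under Arguments might look roughly like the sketch below. This is a hypothetical re-implementation written from the documented behaviour, not the PTXQC source; the name sup_count_sketch is an assumption, and the sketch assumes x is a non-empty character vector without NA values.

```r
# Hypothetical sketch of the documented behaviour (not the PTXQC code).
# Assumes `x` is a non-empty character vector without NAs.
sup_count_sketch <- function(x, prefix_l = 1) {
  max_l <- max(nchar(x))                 # upper bound: longest string
  while (prefix_l < max_l) {
    prefixes <- substr(x, 1, prefix_l)   # current candidate prefixes
    if (!anyDuplicated(prefixes)) break  # all unique: stop searching
    prefix_l <- prefix_l + 1             # otherwise try one more character
  }
  prefix_l
}

sup_count_sketch(c("abcde...", "abcd...", "abc..."))               # 5
sup_count_sketch(c("doubled", "doubled", "aLongDummyString"))      # 16
```

With duplicated entries the loop runs until the bound is reached, so the result equals the length of the longest string, which matches the fallback stated in the Description.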

Examples

  supCount(c("abcde...", "abcd...", "abc..."))  ## 5

  x = c("doubled", "doubled", "aLongDummyString")
  all( substr(x, 1, supCount(x)) == x )   
  ## TRUE (no unique prefix due to duplicated entries)


[Package PTXQC version 1.1.1 Index]