concatMatch {wrMisc}    R Documentation

Value Matching With Option For Concatenated Terms


This is a match()-like function for searching among concatenated terms/IDs. Additional options are available to remove text patterns such as terminal lowercase extensions. The function returns a named vector indicating the positions of (first) matches, similar to match.


concatMatch(
  x,
  table,
  sep = ",",
  sepPattern = NULL,
  globalPat = "digitExtension",
  nomatch = NA_integer_,
  incomparables = NULL,
  extensPat = TRUE,
  silent = FALSE,
  debug = FALSE,
  callFrom = NULL
)



x
(vector) the values to be matched


table
(vector) the values to be matched against (i.e. the reference)


sep
(character) separator character, used when concatenated entries are tested


sepPattern
(character or NULL) optional custom pattern for splitting concatenations of x and table (for cases where the default NULL is not sufficient)


globalPat
(character) pattern for additional trimming of search terms. If globalPat="digitExtension", terminal digit extensions (e.g. the '-4' in 'A1234-4') are not considered when matching


nomatch
(vector) similar to match, the value to be returned when no match is found


incomparables
(vector) similar to match, a vector of values that cannot be matched. Any value in x matching a value in this vector is assigned the nomatch value.


extensPat
(logical) similar to match the value to be returned in the case when no match is found


silent
(logical) suppress messages


debug
(logical) additional messages for debugging


callFrom
(character) allows easier tracking of messages produced
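
Since nomatch and incomparables are described as behaving like their counterparts in match, the following base-R sketch may help to recall that behaviour. It uses only base match() on simple (non-concatenated) vectors; the vectors tab and qu are made up for illustration, and whether concatMatch treats every corner case identically is not verified here.

```r
# Illustration with base::match() only (not concatMatch itself)
tab <- c("AA", "BB", "CC")     # hypothetical reference
qu <- c("BB", "ZZ", "AA")      # hypothetical query

resDefault <- match(qu, tab)                        # no match gives NA
resZero <- match(qu, tab, nomatch = 0L)             # no match gives 0 instead
resIncomp <- match(qu, tab, incomparables = "AA")   # "AA" is never matched

resDefault   # 2 NA 1
resZero      # 2 0 1
resIncomp    # 2 NA NA
```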


The main motivation for creating this function was to handle concatenated entries and to check whether any of the concatenated values match 'x'. This function offers additional options for trimming values before running the main comparison.

Of course, the concatenation strategy must be known, and only a single concatenation separator (which may be multiple characters long) may be used for both x and table. Thus, the result will only indicate that at least one of the concatenated terms had a match, but not which one. Finally, both vectors x and table may contain concatenated terms. In this case the function will require considerably more computational resources due to the increased combinatorics when comparing larger vectors.
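
The idea of matching concatenated entries can be sketched in base R as follows. This is only a minimal illustration under simplifying assumptions (a fixed separator, no trimming of terms); the helper firstConcatHit is hypothetical and is not the implementation used in wrMisc.

```r
# Hypothetical helper (not part of wrMisc): for each entry of 'x', return the
# first position in 'table' where any term of 'x' equals any term of 'table'
firstConcatHit <- function(x, table, sep = ",") {
  xTerms <- strsplit(x, sep, fixed = TRUE)
  tabTerms <- strsplit(table, sep, fixed = TRUE)
  # long format: every single term of 'table' plus the entry it came from
  tabLong <- unlist(tabTerms)
  tabPos <- rep(seq_along(tabTerms), lengths(tabTerms))
  out <- vapply(xTerms, function(tm) {
    hit <- tabPos[tabLong %in% tm]
    if (length(hit) > 0) min(hit) else NA_integer_
  }, integer(1))
  names(out) <- x
  out
}

tab2 <- c("AA", "WW,Vde,BB-5,E", "CCab", "FF,Uef")
x2 <- c("ZZ,Z", "AA,Z,Y", "E,Q", "Uef")
res <- firstConcatHit(x2, tab2)
res   # NA 1 2 4 (named by the entries of x2)
```

As in the description above, the result only tells that some term of "E,Q" was found in the 2nd entry of tab2, not which term it was.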

Please note that in case of multiple-to-multiple matches, only the first hit gets reported.

The argument globalPat="digitExtension" allows, e.g., reducing 'A1234-4' to 'A1234'.
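
The effect of such trimming can be imitated in base R with a regular expression. The pattern below (a hyphen followed by digits at the end of the string) and the helper trimDigitExt are assumptions made for this illustration; the pattern actually used by wrMisc may differ.

```r
# Hypothetical helper; the pattern is an assumption, not taken from wrMisc
trimDigitExt <- function(ids, pat = "-[[:digit:]]+$") sub(pat, "", ids)

ids <- c("A1234-4", "BB-5", "CCab", "BB-2")
trimmed <- trimDigitExt(ids)
trimmed   # "A1234" "BB" "CCab" "BB"

# after trimming, 'BB-2' and 'BB-5' are treated as the same ID
pos <- match(trimDigitExt("BB-2"), trimDigitExt(c("AA", "BB-5", "CCab")))
pos       # 2
```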


This function returns a named vector indicating the positions of (first) matches in table, similar to match; entries of x without any match are given the nomatch value

See Also

match (for two simple vectors without concatenated terms), grep


tab1 <- c("AA","BB-5","CCab","FF")               # no concatenation in 'table'
tab2 <- c("AA","WW,Vde,BB-5,E","CCab","FF,Uef")  # concatenated entries in 'table'
x1 <- c("ZZ","YY","AA","BB-2","DD","CCdef","Dxy")            # modif of single ID (no concat)
concatMatch(x1, tab2)
x2 <- c("ZZ,Z","YY,Y","AA,Z,Y","BB-2","DD","X,CCdef","Dxy")  # concatenated in 'x'
concatMatch(x2, tab2)
concatMatch(x2, tab1)                          # concatenated in 'x' only
concatMatch(x1, tab1)                          # simple case of no concat anywhere

[Package wrMisc version 1.15.1 Index]