concatMatch {wrMisc} | R Documentation |
Value Matching With Option For Concatenated Terms
Description
This is a _match()_-like function that allows searching among concatenated terms/IDs. Additional options are available to remove text patterns such as a terminal lowercase extension.
The function returns a named vector indicating the positions of (first) matches, similar to match.
Usage
concatMatch(
x,
table,
sep = ",",
sepPattern = NULL,
globalPat = "digitExtension",
nomatch = NA_integer_,
incomparables = NULL,
extensPat = TRUE,
silent = FALSE,
debug = FALSE,
callFrom = NULL
)
Arguments
x |
(vector) the values to be matched |
table |
(vector) the values to be matched against (i.e. the reference) |
sep |
(character) separator character in case concatenation of entries is tested |
sepPattern |
(character or |
globalPat |
(character) pattern for additional trimming of search-terms. If |
nomatch |
(vector) similar to |
incomparables |
(vector) similar to |
extensPat |
(logical) similar to |
silent |
(logical) suppress messages |
debug |
(logical) additional messages for debugging |
callFrom |
(character) allow easier tracking of messages produced |
Details
The main motivation for this function was to handle concatenated entries and to check whether any of the concatenated values match 'x'.
This function offers additional options for trimming values before running the main comparison.
Of course, the concatenation strategy must be known, and only a single concatenation separator (which may be multiple characters long) may be used for both x and table.
Thus, the result will only indicate that at least one of the concatenated terms had a match, but not which one.
Finally, both vectors x and table may contain concatenated terms.
In this case the function will require much more computational resources, due to the increased combinatorics when comparing larger vectors.
Please note that in case of multiple-to-multiple matches, only the first hit gets reported.
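The matching idea described above can be illustrated with a minimal base-R sketch. It is not the code used inside concatMatch(): the helper name firstConcatMatch is hypothetical, it applies no trimming (no globalPat or extensPat handling), and naming the result by 'x' is an assumption made here for readability.

```r
# Hypothetical helper (not part of wrMisc): for each entry of 'x', report the
# position of the first entry of 'table' sharing at least one term with it.
firstConcatMatch <- function(x, table, sep = ",") {
  # Split concatenated entries into their individual terms
  xTerms <- strsplit(x, sep, fixed = TRUE)
  tabTerms <- strsplit(table, sep, fixed = TRUE)
  out <- vapply(xTerms, function(xi) {
    # Which entries of 'table' contain any of the terms of this 'x' entry?
    hit <- which(vapply(tabTerms, function(ti) any(xi %in% ti), logical(1)))
    # Only the first hit is kept; no hit gives NA
    if (length(hit) > 0) hit[1] else NA_integer_
  }, integer(1))
  names(out) <- x   # assumption: name the result by the query entries
  out
}

x <- c("ZZ,Z", "AA,Z,Y", "Uef")
tab <- c("AA", "WW,Vde,BB-5,E", "CCab", "FF,Uef")
res <- firstConcatMatch(x, tab)
res
```

Every entry of 'x' is compared with every entry of 'table' term by term, which shows why the cost grows quickly when both vectors are long and contain concatenated entries.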
The argument globalPat="digitExtension" allows e.g. reducing 'A1234-4' to 'A1234'.
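As a rough illustration of this kind of trimming, the base-R sketch below removes a terminal hyphen followed by digits. The regular expression is an assumption chosen to reproduce the 'A1234-4' example; the pattern actually used by concatMatch() for "digitExtension" may differ.

```r
# Assumed pattern: a hyphen followed by one or more digits at the end of the ID
ids <- c("A1234-4", "B77-12", "C5", "D-x")
trimmed <- sub("-[[:digit:]]+$", "", ids)
trimmed
# "A1234" "B77" "C5" "D-x"   (entries without a terminal digit extension stay unchanged)
```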
Value
This function returns a named vector indicating the positions of (first) matches in table, similar to match
See Also
match (for two simple vectors without concatenated terms), grep
Examples
tab1 <- c("AA","BB-5","CCab","FF")
tab2 <- c("AA","WW,Vde,BB-5,E","CCab","FF,Uef")
x1 <- c("ZZ","YY","AA","BB-2","DD","CCdef","Dxy") # modif of single ID (no concat)
concatMatch(x1, tab2)
x2 <- c("ZZ,Z","YY,Y","AA,Z,Y","BB-2","DD","X,CCdef","Dxy") # concatenated in 'x'
concatMatch(x2, tab2)
tab1 <- c("AA","BB-5","CCab","FF") # no concatenation in 'table'
concatMatch(x2, tab1)
# simple case of no concatenation anywhere
concatMatch(x1, tab1)