icd_parse {ICD10gm} | R Documentation |
Extract all ICD codes from a character vector
Description
An ICD code consists of, at a minimum, a three-digit ICD-10 code (i.e. one upper-case letter followed by two digits). This may optionally be followed by a subcode of up to two digits and by selected punctuation symbols (asterisk "*", dagger "†" or exclamation mark "!"). Both the period separating the three-digit code from the subcode and the hyphen indicating an "incomplete" subcode are optional. Finally, in the ambulatory system, an additional letter G, V, Z or A may be appended to signify the status ("certainty") of the diagnosis.
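The structure described above can be approximated with a regular expression. The following base R sketch is illustrative only: the names icd_pattern and extract_icd_sketch() are hypothetical, the pattern is an assumption rather than the expression used internally by the package, and it does not reproduce the matching modes selected by the type argument.

```r
# Illustrative pattern only: an approximation of the structure described
# above, not the regular expression used internally by ICD10gm.
icd_pattern <- paste0(
  "\\b[A-Z][0-9]{2}",             # upper-case letter followed by two digits
  "(?:\\.?(?:[0-9]{1,2}-?|-))?",  # optional subcode; period and hyphen are optional
  "[*\u2020!]?",                  # optional asterisk, dagger or exclamation mark
  "[GVZA]?"                       # optional status letter (ambulatory system)
)

# Hypothetical helper: return all matches found in each element of `x`
# as a list with one character vector per input element.
extract_icd_sketch <- function(x) {
  regmatches(x, gregexpr(icd_pattern, x, perl = TRUE))
}

extract_icd_sketch(c("E11.7", "Depression: F32", "R52.1 and F45.4G"))
```

The optional parts are listed in the order given in the description, so a code such as "E11.-" or "E117" would also be picked up by this sketch. Whether icd_parse accepts exactly the same set of strings is not verified here.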
Usage
icd_parse(str, type = "bounded", bind_rows = TRUE)
Arguments
str |
Character vector from which to extract all ICD codes |
type |
A character string determining how strictly matching should be performed. This must be one of "strict" ( |
bind_rows |
logical. Whether to convert the matrix output of |
Details
By default, the function returns a data.frame containing the matched codes and the standardised three digit code (icd3), subcode (icd_subcode), normcode (icd_norm) and code without period (icd_sub).
If bind_rows = FALSE, the list output of stringi::stri_match_all_regex is returned. This is particularly useful to retrieve the matches from each element of the str vector separately.
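As a rough illustration of how the standardised columns relate to one another, the base R sketch below splits codes that have already been extracted into their parts. The helper split_icd_sketch() is hypothetical, and the definitions of icd_norm (parts joined by a period) and icd_sub (period removed) are assumptions based on the column descriptions above. The values returned by icd_parse may differ.

```r
# Hypothetical helper mirroring the column names listed above; the exact
# definitions used by icd_parse() may differ.
split_icd_sketch <- function(code) {
  # Capture the three-digit code and up to two subcode digits
  m <- regmatches(code, regexec("^([A-Z][0-9]{2})\\.?([0-9]{0,2})", code))
  icd3 <- vapply(m, function(x) x[2], character(1))
  icd_subcode <- vapply(m, function(x) x[3], character(1))
  has_sub <- nzchar(icd_subcode)
  data.frame(
    icd3 = icd3,
    icd_subcode = icd_subcode,
    # Assumed: the normcode joins both parts with a period
    icd_norm = ifelse(has_sub, paste0(icd3, ".", icd_subcode), icd3),
    # Assumed: the same code with the period removed
    icd_sub = paste0(icd3, icd_subcode),
    stringsAsFactors = FALSE
  )
}

split_icd_sketch(c("E11.7", "F32", "M5499"))
```

The sketch expects each input to start with a valid code. Handling of unmatched input, punctuation symbols and status letters is left out for brevity.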
Value
data.frame (if bind_rows = TRUE) or a list of character matrices, one per element of str (if bind_rows = FALSE)
Examples
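# A single code
icd_parse("E11.7")
# A code embedded in free text
icd_parse("Depression: F32")
# Several codes spread across the elements of a character vector
icd_parse(c(
  "Backpain (M54.9) is one of the most common diagnoses in primary care",
  "Codes for chronic pain include R52.1 and F45.4"
))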