R: Extract all ICD codes from a character vector

icd_parse {ICD10gm}

R Documentation

Extract all ICD codes from a character vector

Description

An ICD code consists of, at a minimum, a three-digit ICD-10 code (i.e. one upper-case letter followed by two digits). This may optionally be followed by a subcode of up to two digits and by selected punctuation symbols (asterisk "*", dagger "†" or exclamation mark "!"). Both the period separating the three-digit code from the subcode and the hyphen indicating an "incomplete" subcode are optional. Finally, in the ambulatory system, an additional letter G, V, Z or A may be appended to signify the status ("security") of the diagnosis.
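
The structure described above can be pictured as a regular expression. The following base R sketch is only an illustration written for this page: the name `icd_sketch` and the test strings are hypothetical, and the pattern is a simplified assumption, not the expression that `icd_parse` uses internally.

```r
# Illustrative pattern only; this is NOT the expression ICD10gm uses internally.
icd_sketch <- paste0(
  "[A-Z][0-9]{2}",            # upper-case letter plus two digits, e.g. "E11"
  "(\\.?[0-9]{1,2}|\\.?-)?",  # optional subcode or "incomplete" hyphen; period optional
  "[*\u2020!]?",              # optional asterisk, dagger or exclamation mark
  "[GVZA]?"                   # optional status letter (ambulatory system)
)

# Anchor the pattern so that the whole string has to look like a code
candidates <- c("E11.7", "E117", "I25.-", "G63.2*", "F32G", "Diabetes")
is_code <- grepl(paste0("^", icd_sketch, "$"), candidates)
setNames(is_code, candidates)
```

Under these assumptions every string except "Diabetes" is accepted, which reflects that the period, the subcode, the punctuation symbol and the status letter are all optional.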

Usage

icd_parse(str, type = "bounded", bind_rows = TRUE)

Arguments

`str`	Character vector from which to extract all ICD codes
`type`	A character string determining how strictly matching should be performed. This must be one of `strict` (`str` contains an ICD code with no extraneous characters), `bounded` (`str` contains an ICD code with a word boundary on both sides) or `weak` (ICD codes are extracted even if they are contained within a word, e.g. "E10Diabetes" would return "E10"). Default: `bounded`.
`bind_rows`	logical. Whether to convert the matrix output of `stringi::stri_match_all` to a data.frame, with an additional `icd_sub` column that uniquely represents the code and allows the code to be looked up. Default: `TRUE`.
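
The three values of `type` differ only in what may surround a code. The base R sketch below mimics that difference with anchors and word boundaries around a deliberately minimal core pattern (three-digit codes only). The names `core`, `patterns` and `hits` are hypothetical, and the patterns are an assumption made for illustration rather than the package's own expressions.

```r
# Minimal core pattern: three-digit codes only (illustrative assumption)
core <- "[A-Z][0-9]{2}"

patterns <- c(
  strict  = paste0("^", core, "$"),      # nothing but the code
  bounded = paste0("\\b", core, "\\b"),  # word boundary on both sides
  weak    = core                         # code may sit inside a word
)

x <- c("E10", "Depression: F32", "E10Diabetes")

# One column per matching mode, one row per input string
hits <- sapply(patterns, function(p) grepl(p, x, perl = TRUE))
rownames(hits) <- x
hits
```

In this sketch "E10" matches in all three modes, "Depression: F32" matches only under `bounded` and `weak`, and "E10Diabetes" matches only under `weak`.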

Details

By default, the function returns a data.frame containing the matched codes together with the standardised three-digit code (icd3), subcode (icd_subcode), normcode (icd_norm) and code without period (icd_sub).

If bind_rows = FALSE, the list output of stringi::stri_match_all_regex is returned. This is particularly useful to retrieve the matches from each element of the str vector separately.
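
A short sketch of the two output modes, using only the arguments documented on this page. It assumes the ICD10gm package is installed and is skipped otherwise; the object names `x`, `combined` and `per_element` are hypothetical, and the exact shape of each list entry is not asserted here.

```r
# Run only if ICD10gm is available (assumption: installed from CRAN)
if (requireNamespace("ICD10gm", quietly = TRUE)) {
  x <- c(
    "Backpain (M54.9) is one of the most common diagnoses in primary care",
    "Codes for chronic pain include R52.1 and F45.4"
  )

  # Default: all matches combined into a single data.frame
  combined <- ICD10gm::icd_parse(x)
  print(combined)

  # bind_rows = FALSE: per the text above, a list with the matches
  # for each element of x kept separate
  per_element <- ICD10gm::icd_parse(x, bind_rows = FALSE)
  print(per_element[[2]])  # matches found in the second string only
}
```

Keeping the list form is convenient when the matches have to be traced back to the input string they came from.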

Value

data.frame (if bind_rows = TRUE) or a list of matrices, one per element of str (if bind_rows = FALSE)

Examples

icd_parse("E11.7")
icd_parse("Depression: F32")
icd_parse(c(
  "Backpain (M54.9) is one of the most common diagnoses in primary care",
  "Codes for chronic pain include R52.1 and F45.4"
  ))

[Package ICD10gm version 1.2.5 Index]