icd_parse {ICD10gm}    R Documentation

Extract all ICD codes from a character vector

Description

An ICD code consists of, at a minimum, a three-digit ICD-10 code (i.e. one upper-case letter followed by two digits). This may optionally be followed by a subcode of one or two digits and by one of a small set of punctuation symbols (asterisk "*", dagger "†" or exclamation mark "!"). Both the period separating the three-digit code from the subcode and the hyphen indicating an "incomplete" subcode are optional. Finally, in the ambulatory system, an additional letter G, V, Z or A may be appended to signify the status ("security") of the diagnosis.
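
As a rough illustration of the format described above, the following base R sketch checks whether a string has that shape. The helper name looks_like_icd and the regular expression are assumptions made for this example; they are not the pattern icd_parse uses internally, and the check is deliberately permissive.

```r
# Hypothetical pattern assembled from the description; not the package's own regex.
icd_shape <- paste0(
  "^[A-Z][0-9]{2}",  # three-digit code: one upper-case letter, two digits
  "\\.?",            # optional period
  "[0-9]{0,2}",      # optional subcode of up to two digits
  "-?",              # optional hyphen marking an incomplete subcode
  "[*\u2020!]?",     # optional asterisk, dagger (U+2020) or exclamation mark
  "[GVZA]?$"         # optional status letter used in the ambulatory system
)

# Hypothetical helper: TRUE where the whole string has the shape of an ICD code.
looks_like_icd <- function(x) grepl(icd_shape, x, perl = TRUE)

looks_like_icd(c("E11.7", "F32", "E1172", "M54.9G", "E11.-", "e11", "E1"))
```

The check only looks at the shape of the string; it does not confirm that the code exists in any ICD-10-GM version.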

Usage

icd_parse(str, type = "bounded", bind_rows = TRUE)

Arguments

str

Character vector from which to extract all ICD codes

type

A character string determining how strictly matching should be performed. This must be one of "strict" (str contains an ICD code with no extraneous characters), "bounded" (str contains an ICD code with a word boundary on both sides) or "weak" (ICD codes are extracted even if they are contained within a word, e.g. "E10Diabetes" would return "E10"). Default: "bounded".

bind_rows

Logical. Whether to convert the output of stringi::stri_match_all_regex to a data.frame, with an additional column icd_sub that uniquely represents the code and allows it to be looked up.
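
The three values of type listed above can be thought of as three ways of anchoring the same pattern. The base R sketch below illustrates that idea with a simplified pattern (three-digit code plus optional subcode only). The names core and extract_sketch are hypothetical, and the anchoring shown is an assumption about how such modes could be built, not a copy of the package's implementation.

```r
# Simplified, hypothetical core pattern: letter, two digits, optional subcode.
core <- "[A-Z][0-9]{2}(?:\\.?[0-9]{1,2})?"

# Hypothetical helper mimicking the three strictness levels by anchoring.
extract_sketch <- function(x, type = c("bounded", "strict", "weak")) {
  type <- match.arg(type)
  pattern <- switch(
    type,
    strict  = paste0("^", core, "$"),      # nothing but the code
    bounded = paste0("\\b", core, "\\b"),  # word boundary on both sides
    weak    = core                         # match anywhere, even inside a word
  )
  # One list element per input string; character(0) when nothing matches.
  regmatches(x, gregexpr(pattern, x, perl = TRUE))
}

extract_sketch("E10Diabetes", type = "weak")
extract_sketch("E10Diabetes", type = "bounded")
extract_sketch("Depression: F32", type = "bounded")
extract_sketch("Depression: F32", type = "strict")
```

In this sketch the "weak" call picks up "E10" from "E10Diabetes", while the "bounded" call finds nothing there because no word boundary follows the code.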

Details

By default, the function returns a data.frame containing the matched codes together with the standardised three-digit code (icd3), subcode (icd_subcode), normcode (icd_norm) and code without period (icd_sub).

If bind_rows = FALSE, the list output of stringi::stri_match_all_regex is returned. This is particularly useful to retrieve the matches from each element of the str vector separately.
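
A minimal usage sketch of the bind_rows = FALSE case follows. The input strings are illustrative, the variable names notes and per_note are made up for this example, and the call is guarded so that the snippet also runs where ICD10gm is not installed. No output is shown because the exact layout of the returned matrices is not described here.

```r
# Illustrative input, not taken from any real data set.
notes <- c(
  "Backpain (M54.9) is one of the most common diagnoses in primary care",
  "Codes for chronic pain include R52.1 and F45.4"
)

# Keep the per-element list output instead of a single data.frame.
per_note <- if (requireNamespace("ICD10gm", quietly = TRUE)) {
  ICD10gm::icd_parse(notes, bind_rows = FALSE)
} else {
  NULL  # package not available; nothing to inspect
}

# When available, look at the matches belonging to the second note only.
if (!is.null(per_note)) print(per_note[[2]])
```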

Value

data.frame (if bind_rows = TRUE) or a list of character matrices, one per element of str (if bind_rows = FALSE)

See Also

is_icd_code()

Examples

icd_parse("E11.7")
icd_parse("Depression: F32")
icd_parse(c(
  "Backpain (M54.9) is one of the most common diagnoses in primary care",
  "Codes for chronic pain include R52.1 and F45.4"
  ))

[Package ICD10gm version 1.2.5 Index]