extract_entities {medExtractR} | R Documentation |
Extract Medication Entities From Phrase
Description
This function searches a phrase for medication dosing entities of interest. It
is called within medExtractR and is generally not intended for use outside
that function. The phrase argument, which contains the text to search,
corresponds to an individual mention of the drug of interest.
Usage
extract_entities(
phrase,
p_start,
p_stop,
unit,
frequency_fun = NULL,
intaketime_fun = NULL,
duration_fun = NULL,
route_fun = NULL,
strength_sep = NULL,
...
)
Arguments
phrase |
Text to search. |
p_start |
Start position of phrase within original text. |
p_stop |
End position of phrase within original text. |
unit |
Unit of measurement for medication strength, e.g. ‘mg’. |
frequency_fun |
Function used to extract frequency. |
intaketime_fun |
Function used to extract intake time. |
duration_fun |
Function used to extract duration. |
route_fun |
Function used to extract route. |
strength_sep |
Delimiter for contiguous medication strengths. |
... |
Parameter settings used in extracting frequency and intake time,
including additional arguments to the |
Details
Various medication dosing entities are extracted within this function including the following:
strength: The amount of drug in a given dosage form (e.g., tablet, capsule).
dose amount: The number of tablets, capsules, etc. taken at a given intake time.
dose strength: The total amount of drug taken at a given intake. This quantity is
equivalent to strength x dose amount, and appears similar to strength when
dose amount is absent.
frequency: The number of times per day a dose is taken, e.g.,
“once daily” or ‘2x/day’.
intaketime: The time period of the day during which a dose is taken,
e.g., ‘morning’, ‘lunch’, ‘in the pm’.
duration: How long a patient is on a drug regimen, e.g., ‘2 weeks’,
‘mid-April’, ‘another 3 days’.
route: The administration route of the drug, e.g., ‘by mouth’,
‘IV’, ‘topical’.
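The relationship between strength, dose amount, and dose strength can be illustrated with simple arithmetic. The sketch below uses made-up variable names and the values from the phrase "25 mg tablet - 3 tablets"; it is not part of medExtractR.

```r
# Illustrative arithmetic only; these variable names are hypothetical.
strength    <- 25  # mg per tablet (strength)
dose_amount <- 3   # tablets per intake (dose amount)

# Dose strength is the total amount of drug per intake.
dose_strength <- strength * dose_amount
dose_strength
```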
Note that the entities drug name, dose change, and time of last dose are not
extracted by the extract_entities function. Those entities are extracted separately
and appended to the extract_entities output within the main medExtractR function.
Strength, dose amount, and dose strength are primarily numeric quantities, and are identified
using a combination of regular expressions and rule-based approaches. Frequency, intake time,
route, and duration, on the other hand, use dictionaries for identification.
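As a rough illustration of dictionary-based identification, the base R sketch below looks up a small set of expressions in a phrase. The helper find_dictionary_match is hypothetical and assumes the dictionary entries are plain text without regular expression metacharacters; the package's own matching logic may differ.

```r
# Hypothetical helper, not part of medExtractR: return the first dictionary
# expression found in a phrase, or NA if none is present.
find_dictionary_match <- function(phrase, dictionary) {
  # List longer expressions first so they are preferred at a shared start position
  dictionary <- dictionary[order(nchar(dictionary), decreasing = TRUE)]
  pattern <- paste0("\\b(", paste(dictionary, collapse = "|"), ")\\b")
  m <- regexpr(pattern, phrase, ignore.case = TRUE, perl = TRUE)
  if (m == -1) return(NA_character_)
  regmatches(phrase, m)
}

phrase <- "Lamotrigine 25 mg tablet - 3 tablets oral twice daily"
freq_match <- find_dictionary_match(phrase, c("daily", "twice daily"))
no_match   <- find_dictionary_match(phrase, c("weekly"))
```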
By default, and whenever an <entity>_fun argument is NULL, the extract_generic
function is used to extract that entity. This function
can also inherit user-defined entity dictionaries, supplied as <entity>_dict arguments
to medExtractR or medExtractR_tapering
(see the documentation for those functions for details).
The strength_sep argument is NULL by default, but can be used to
identify shorthand for morning and evening doses. For example, consider the
phrase “Lamotrigine 300-200” (meaning 300 mg in the morning and 200 mg
in the evening). Setting strength_sep = '-' identifies
the full expression 300-200 as dose strength in this phrase.
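To make the idea of a contiguous strength expression concrete, here is a minimal base R sketch that matches numbers joined by a separator. The pattern is an assumption made for illustration and is not taken from the medExtractR source.

```r
# Illustrative only: match numbers joined by a separator, e.g. "300-200".
phrase <- "Lamotrigine 300-200"
sep    <- "-"  # plays the role of strength_sep in this sketch

# \Q...\E quotes the separator so it is treated literally (requires perl = TRUE)
pattern <- paste0("[0-9]+(\\Q", sep, "\\E[0-9]+)+")
m <- regexpr(pattern, phrase, perl = TRUE)

contiguous_strength <- regmatches(phrase, m)
contiguous_strength
```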
Value
data.frame with entity information. At least one row per entity is returned,
using NA when no expression was found for a given entity.
The “entity” column of the output contains the formatted label for that entity, according to
the following mapping.
strength: “Strength”
dose amount: “DoseAmt”
dose strength: “DoseStrength”
frequency: “Frequency”
intake time: “IntakeTime”
duration: “Duration”
route: “Route”
Sample output for the phrase “Lamotrigine 200mg bid” would look like:
entity | expr |
IntakeTime | <NA> |
Strength | <NA> |
DoseAmt | <NA> |
Route | <NA> |
Duration | <NA> |
Frequency | bid;19:22 |
DoseStrength | 200mg;13:18 |
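Judging from the sample above, each non-missing expr value appears to follow the pattern text;start:stop. Under that assumption, the following base R sketch splits the column into separate fields; the data frame is built by hand here to mirror part of the sample output rather than produced by extract_entities.

```r
# Hand-built stand-in for part of the sample output above.
out <- data.frame(
  entity = c("Strength", "Frequency", "DoseStrength"),
  expr   = c(NA, "bid;19:22", "200mg;13:18"),
  stringsAsFactors = FALSE
)

# Keep rows where an expression was found, then split "text;start:stop".
found <- out[!is.na(out$expr), ]
m <- regmatches(found$expr, regexec("^(.*);([0-9]+):([0-9]+)$", found$expr))
found$text  <- vapply(m, function(x) x[2], character(1))
found$start <- as.integer(vapply(m, function(x) x[3], character(1)))
found$stop  <- as.integer(vapply(m, function(x) x[4], character(1)))
found
```

Anchoring the pattern on the final ;start:stop suffix keeps the split intact even if the matched text itself contains a colon or semicolon.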
Examples
note <- "Lamotrigine 25 mg tablet - 3 tablets oral twice daily"
extract_entities(note, 1, nchar(note), "mg")
# A user-defined dictionary can be used instead of the default
my_dictionary <- data.frame(c("daily", "twice daily"))
extract_entities(note, 1, nchar(note), "mg", frequency_dict = my_dictionary)