normalise {hospitals} | R Documentation |
Normalise hospital names
Description
normalise() tries to match the provided hospital names to the Portuguese NHS hospitals, i.e. to the hospitals included in the data set hospitals, thus allowing conversion to standard hospital names. By default, it returns the shortened version of the hospital name: column hospital_short_name in hospitals. Use the return argument to return a different variable; see below for possible values.
Usage
normalise(
  nm,
  return = c("hospital_short_name", "hospital_full_name", "hospital_id",
    "hospital_acronym"),
  unmatched_as_na = TRUE
)

normalize(
  nm,
  return = c("hospital_short_name", "hospital_full_name", "hospital_id",
    "hospital_acronym"),
  unmatched_as_na = TRUE
)
Arguments
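nm | A character vector of hospital names.
return | A string indicating the hospital attribute to be returned: either "hospital_short_name" (the default), "hospital_full_name", "hospital_id" or "hospital_acronym".
unmatched_as_na | A logical indicating whether unmatched hospital names are returned as NA (default TRUE).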
Details
The method behind normalise() for matching hospital names is based on a heuristic that uses a minimal set of keywords to identify the hospital. This is implemented with regular expressions, which are provided in the data set hospitals, column hospital_regex. The method is case-insensitive and tolerates variations in the name as long as one of the critical keywords is found in it. Note, however, that the regular expressions have been designed so that matches are mutually exclusive: the same hospital name will never match more than one hospital in the data set hospitals.
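The keyword matching described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: toy_table, its labels, its patterns and toy_normalise() are hypothetical names invented for the example, and returning NA when several rows match is an assumption of this sketch.

```r
# Toy lookup table. Labels and patterns are illustrative assumptions,
# not the contents of the package's `hospitals` data set.
toy_table <- data.frame(
  label = c("Toy hospital A", "Toy hospital B"),
  regex = c("matosinhos", "guarda"),
  stringsAsFactors = FALSE
)

# For each input name, return the label of the single row whose regex
# matches (case-insensitively), or NA when no row or several rows match.
toy_normalise <- function(nm, table = toy_table) {
  vapply(nm, function(x) {
    hit <- vapply(table$regex,
                  function(rx) grepl(rx, x, ignore.case = TRUE),
                  logical(1))
    if (sum(hit) == 1L) table$label[hit] else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

toy_normalise(c("Hospital de MATOSINHOS", "ULS da Guarda", "unknown"))
```

Because each pattern is a short keyword, the surrounding words and the letter case of the input do not affect the match.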
normalise() is aware of deprecated hospital names and maps those old designations to the new hospital names, e.g., Hospital do Alto Ave is correctly mapped to Hospital da Senhora da Oliveira, Guimarães, EPE.
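One way a single regular expression could cover both an old and a current designation is alternation. The pattern below is a hypothetical one written for this illustration; the actual pattern stored in hospital_regex may differ.

```r
# Hypothetical pattern: one alternative per designation, old or current.
rx <- "alto ave|senhora da oliveira"

old_and_new <- c("Hospital do Alto Ave",
                 "Hospital da Senhora da Oliveira")

# Each name is tested against the same pattern, ignoring letter case.
grepl(rx, old_and_new, ignore.case = TRUE)
```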
normalise() is lenient with typos involving accented characters, so, e.g., both 'Hospital de São João' and 'Hospital de Sao Joao' correctly match the same hospital: CHU de São João.
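Accent tolerance of this kind can be obtained with character classes that accept a vowel with or without its accent. The pattern below is an assumption made for illustration; it is not taken from the hospitals data set.

```r
# Hypothetical pattern; "\u00e3" is the escape for a-tilde, so each
# bracket accepts the vowel with or without the accent.
rx <- "s[a\u00e3]o jo[a\u00e3]o"

variants <- c("Hospital de S\u00e3o Jo\u00e3o",  # accented spelling
              "Hospital de Sao Joao")            # unaccented spelling

# Both spellings are tested against the same pattern, ignoring letter case.
grepl(rx, variants, ignore.case = TRUE)
```

Writing the accented letter as a Unicode escape keeps the example independent of the encoding of the source file.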
Value
A character vector.
Examples
# Match hospital with a single keyword
normalise('Matosinhos')
# The same, but return now the full name
normalise('Matosinhos', 'hospital_full_name')
# Get instead the hospital identifier
normalise('Matosinhos', 'hospital_id')
# Or even just the acronym (useful for labelling in plots)
normalise('Matosinhos', 'hospital_acronym')
# Find hospitals from their old names
# "Hospital do Alto Ave" is the old name for 'Hospital da Senhora da Oliveira, Guimarães, EPE'
normalise('Hospital do Alto Ave', 'hospital_full_name')
# `normalise()` is vectorised over `nm`
normalise(nm = c('medio tejo', 'oeste', 'guarda'))