uk_patient_id {epidm} | R Documentation |
Patient ID record grouping
Description
Groups patient records from multiple isolates with a single integer patientID by grouping patient identifiers.
Grouping is based on five stages:
matching nhs number and date of birth
Hospital number & Date of Birth
NHS number & Hospital Number
Date of Birth & Surname IF nhs unknown
Sex & Date of Birth & Fuzzy Name
Identifiers are copied over where they are missing or invalid to the grouped records.
Usage
uk_patient_id(
x,
nhs_number,
hospital_number,
date_of_birth,
sex_mfu,
forename = "NONAME",
surname = "NONAME",
.sortOrder,
.keepValidNHS = FALSE,
.forceCopy = FALSE,
.experimental = FALSE
)
Arguments
x |
a data.frame or data.table containing the cleaned line list |
nhs_number |
a column as a character containing the patient NHS numbers |
hospital_number |
a column as a character containing the patient Hospital numbers |
date_of_birth |
a column as a date variable containing the patient date of birth in date format |
sex_mfu |
column as a character containing the patient sex;
NOTE only works if coded only as character versions of Male/Female/Unknown;
does not currently work with additional options |
forename |
a column as a character containing the patient forename; leave as NONAME if unavailable |
surname |
a column as a character containing the patient surname; leave as NONAME if unavailable |
.sortOrder |
optional; a column as a character to allow a sorting order on the id generation |
.keepValidNHS |
optional, default FALSE; set TRUE if you wish to retain the column with the NHS checksum result stored as a BOOLEAN |
.forceCopy |
optional, default FALSE; TRUE will force data.table to take a copy instead of editing the data without reference |
.experimental |
optional, default FALSE; TRUE will enable the experimental features for recoding NA values based on the mode |
Value
A dataframe with one new variable:
id
a unique patient id
valid_nhs
if retained using argument
.keepValidNHS=TRUE
, a BOOLEAN containing the result of the NHS checksum validation
Examples
id_test <- data.frame(
nhs_n = c(
9434765919,9434765919,9434765919,NA,NA,
3367170666,5185293519,5185293519,5185293519,8082318562,NA,NA,NA
),
hosp_n = c(
'13','13','13','UNKNOWN','13','13','13','31','31','96','96',NA,'96'),
sex = c(rep('F',6),rep('Male',4), 'U', 'U', 'M'),
dateofbirth = as.Date(
c(
'1988-10-06','1988-10-06','1900-01-01','1988-10-06','1988-10-06',
'1988-10-06','1988-10-06','1988-10-06','1988-10-06','2020-01-28',
'2020-01-28','2020-01-28','2020-01-28'
)
),
firstname = c(
'Danger','Danger','Denger','Danger','Danger','DANGER','Danger',
'Danger','Danger','Crazy','Crazy','Krazy','C'
),
lastname = c(
'Mouse','Mause','Mouse','Moose','Moose','Mouse','MOUSe',
'Mouse','Mouse','Frog','FROG','Frug','Frog'
),
testdate = sample(seq.Date(Sys.Date()-21,Sys.Date(),"day"),13,replace = TRUE)
)
uk_patient_id(x = id_test,
nhs_number = 'nhs_n',
hospital_number = 'hosp_n',
forename = 'firstname',
surname = 'lastname',
sex_mfu = 'sex',
date_of_birth = 'dateofbirth',
.sortOrder = 'testdate')[]