checkBase {HospitalNetwork} | R Documentation |
General check function
Description
Performs a series of checks to ensure that the database is correctly formatted, and adjusts overlapping patient records.
Usage
checkBase(
base,
convertDates = FALSE,
dateFormat = NULL,
deleteMissing = NULL,
deleteErrors = NULL,
subjectID = "sID",
facilityID = "fID",
disDate = "Ddate",
admDate = "Adate",
maxIteration = 25,
retainAuxData = TRUE,
verbose = TRUE,
...
)
Arguments
base |
(data.table) A patient discharge database, in the form of a data.table. The data.table should have at least the following columns:
sID: patient ID (character)
fID: facility ID (character)
Adate: admission date (POSIXct; character values can be converted to POSIXct)
Ddate: discharge date (POSIXct; character values can be converted to POSIXct) |
convertDates |
(boolean) Whether dates should be converted to POSIXct if they are not already in that format. Default is FALSE. |
dateFormat |
(character) giving the input format of the date character string (e.g. "ymd" for dates like "2019-10-30")
See |
deleteMissing |
(character) How to handle records that contain a missing value in at least one of the four mandatory variables:
NULL (default): do not delete; the function stops with an error message.
"record": delete just the incorrect record.
"patient": delete all records of each patient with one or more incorrect records. |
deleteErrors |
(character) How incorrect records should be deleted:
"record": delete just the incorrect record.
"patient": delete all records of each patient with one or more incorrect records. |
subjectID |
(character) The name of the column containing the subject ID. Default is "sID". |
facilityID |
(character) The name of the column containing the facility ID. Default is "fID". |
disDate |
(character) The name of the column containing the discharge date. Default is "Ddate". |
admDate |
(character) The name of the column containing the admission date. Default is "Adate". |
maxIteration |
(integer) The maximum number of times the function will try to remove overlapping admissions. Default is 25. |
retainAuxData |
(boolean) Whether to retain additional data provided in the database. Default is TRUE. |
verbose |
(boolean) print diagnostic messages. Default is TRUE. |
... |
other parameters passed on to internal functions |
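To illustrate what an "overlapping" record might look like, here is a minimal sketch in base R. It is not the package's own algorithm: it assumes that two stays of the same patient overlap when the later admission starts before the previous discharge. The toy data and the overlaps_previous column are hypothetical; only the default column names (sID, fID, Adate, Ddate) come from the Usage section.

```r
# Hypothetical toy data using the default column names of checkBase().
stays <- data.frame(
  sID   = c("s001", "s001", "s002", "s002"),
  fID   = c("f01", "f02", "f01", "f03"),
  Adate = as.POSIXct(c("2019-01-01", "2019-01-04", "2019-02-01", "2019-02-10"), tz = "UTC"),
  Ddate = as.POSIXct(c("2019-01-05", "2019-01-08", "2019-02-03", "2019-02-12"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Sort by patient and admission date so each stay follows the previous one.
stays <- stays[order(stays$sID, stays$Adate), ]
n <- nrow(stays)

# TRUE when the previous row belongs to the same patient.
same_patient <- c(FALSE, stays$sID[-1] == stays$sID[-n])

# Discharge time of the previous row, as seconds since the epoch (NA for row 1).
prev_ddate <- c(NA, as.numeric(stays$Ddate[-n]))

# Assumed definition of an overlap: admitted before the previous discharge.
stays$overlaps_previous <- same_patient & as.numeric(stays$Adate) < prev_ddate

print(stays)
```

In this toy data only the second stay of s001 is flagged, because it starts on 2019-01-04 while the previous stay ends on 2019-01-05. How checkBase actually adjusts such records, and why it may need up to maxIteration passes, is not shown here.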
Value
The adjusted database as a data.table with a new class attribute "hospinet.base" and an attribute "report" containing information related to the quality of the database.
See Also
Examples
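## Create a fake database with custom column names
library(HospitalNetwork)
library(data.table)  # provides setnames() and the := operator used below
mydb <- create_fake_subjectDB(n_subjects = 100, n_facilities = 100)
setnames(mydb, 1:4, c("myPatientId", "myHealthCareCenterID", "DateOfAdmission", "DateOfDischarge"))
## Store the dates as character strings, to be converted back by checkBase()
mydb[, DateOfAdmission := as.character(DateOfAdmission)]
mydb[, DateOfDischarge := as.character(DateOfDischarge)]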
head(mydb)
#    myPatientId myHealthCareCenterID DateOfAdmission DateOfDischarge
# 1:        s001                 f078      2019-01-26      2019-02-01
# 2:        s002                 f053      2019-01-18      2019-01-21
# 3:        s002                 f049      2019-02-25      2019-03-05
# 4:        s002                 f033      2019-04-17      2019-04-21
# 5:        s003                 f045      2019-02-02      2019-02-04
# 6:        s003                 f087      2019-03-12      2019-03-19
str(mydb)
#Classes ‘data.table’ and 'data.frame': 262 obs. of 4 variables:
# $ myPatientId : chr "s001" "s002" "s002" "s002" ...
# $ myHealthCareCenterID: chr "f078" "f053" "f049" "f033" ...
# $ DateOfAdmission : chr "2019-01-26" "2019-01-18" "2019-02-25" "2019-04-17" ...
# $ DateOfDischarge : chr "2019-02-01" "2019-01-21" "2019-03-05" "2019-04-21" ...
#- attr(*, ".internal.selfref")=<externalptr>
my_checked_db <- checkBase(mydb,
                           subjectID = "myPatientId",
                           facilityID = "myHealthCareCenterID",
                           disDate = "DateOfDischarge",
                           admDate = "DateOfAdmission",
                           convertDates = TRUE,
                           dateFormat = "ymd")
#Converting Adate, Ddate to Date format
#Checking for missing values...
#Checking for duplicated records...
#Removed 0 duplicates
#Done.
head(my_checked_db)
#     sID  fID      Adate      Ddate
# 1: s001 f078 2019-01-26 2019-02-01
# 2: s002 f053 2019-01-18 2019-01-21
# 3: s002 f049 2019-02-25 2019-03-05
# 4: s002 f033 2019-04-17 2019-04-21
# 5: s003 f045 2019-02-02 2019-02-04
# 6: s003 f087 2019-03-12 2019-03-19
str(my_checked_db)
#Classes ‘hospinet.base’, ‘data.table’ and 'data.frame': 262 obs. of 4 variables:
#$ sID : chr "s001" "s002" "s002" "s002" ...
#$ fID : chr "f078" "f053" "f049" "f033" ...
#$ Adate: POSIXct, format: "2019-01-26" "2019-01-18" "2019-02-25" "2019-04-17" ...
#$ Ddate: POSIXct, format: "2019-02-01" "2019-01-21" "2019-03-05" "2019-04-21" ...
# ...
## Show the quality report
attr(my_checked_db, "report")
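The deleteMissing and deleteErrors arguments are not exercised above. The following is a hedged sketch of how they might be used, based only on the argument descriptions on this page. It assumes that the HospitalNetwork and data.table packages are installed, and that the first column returned by create_fake_subjectDB is a character patient ID (as the str() output above suggests). The names db, n_before and checked are illustrative.

```r
# Run only if both packages are available; otherwise the example is skipped.
if (requireNamespace("HospitalNetwork", quietly = TRUE) &&
    requireNamespace("data.table", quietly = TRUE)) {
  library(HospitalNetwork)
  library(data.table)

  db <- create_fake_subjectDB(n_subjects = 100, n_facilities = 100)
  n_before <- nrow(db)

  # Introduce a missing value: blank out the patient ID (column 1) of record 1.
  set(db, i = 1L, j = 1L, value = NA_character_)

  # Ask checkBase() to drop only the offending records rather than stop.
  checked <- checkBase(db,
                       deleteMissing = "record",
                       deleteErrors = "record",
                       verbose = FALSE)

  # The record with the missing ID should no longer be present.
  print(nrow(checked) < n_before)

  # The result carries the class and the "report" attribute described under Value.
  print(inherits(checked, "hospinet.base"))
  print(names(attr(checked, "report")))
}
```

Using "patient" instead of "record" would, per the argument descriptions, remove every record of the affected patient rather than the single faulty one.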