load_dem {parseRPDR} | R Documentation |
Loads demographic information into R, supporting the new demographic tables introduced at the beginning of 2022.
Description
Loads patient demographic and vital status information into the R environment. Since version 0.2.2 of the software this function supports the new demographics table data definitions.
Usage
load_dem(
file,
merge_id = "EMPI",
sep = ":",
id_length = "standard",
perc = 0.6,
na = TRUE,
identical = TRUE,
nThread = parallel::detectCores() - 1,
mrn_type = FALSE
)
Arguments
file |
string, full file path to Dem.txt. |
merge_id |
string, column name used to create the ID_MERGE column, which is used to merge different datasets. Defaults to EMPI, as it is the preferred MRN in the RPDR system. |
sep |
string, divider between hospital ID and MRN. Defaults to :. |
id_length |
string, indicating whether to modify MRN length based on required values (id_length = standard) or to keep lengths as is (id_length = asis). If id_length = standard, then for MGH, BWH, MCL, EMPI and PMRN the MRN lengths are corrected by adding zeros to, or removing numerals from, the beginning. In other cases the lengths are unchanged. Defaults to standard. |
perc |
numeric, a number between 0 and 1 indicating which parsed ID columns to keep. Columns with data present in at least perc x 100% of patients are kept. Defaults to 0.6. |
na |
boolean, whether to remove columns with only NA values. Defaults to TRUE. |
identical |
boolean, whether to remove columns with identical values. Defaults to TRUE. |
nThread |
integer, number of threads to use to load data. Defaults to parallel::detectCores() - 1. |
mrn_type |
boolean, whether data in MRN_Type and MRN should be parsed. Defaults to FALSE, as parsing these for all data sources is not advised because it takes considerable time. |
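The id_length and perc arguments can be pictured with a small base R sketch. The helpers standardize_id() and keep_populated() below are hypothetical and are not part of parseRPDR; the width of 7 is purely illustrative, since the required lengths per ID type are not stated here, and the "at least perc" threshold is an assumption about how the cut-off is applied.

```r
# Hypothetical helper (not parseRPDR code): force IDs to a fixed width by
# left-padding with zeros or dropping characters from the beginning.
standardize_id <- function(id, width) {
  id <- as.character(id)
  too_long <- !is.na(id) & nchar(id) > width
  # Keep only the last `width` characters of IDs that are too long
  id[too_long] <- substring(id[too_long], nchar(id[too_long]) - width + 1)
  too_short <- !is.na(id) & nchar(id) < width
  # Left-pad short IDs with zeros
  id[too_short] <- paste0(strrep("0", width - nchar(id[too_short])), id[too_short])
  id
}

# Hypothetical helper (not parseRPDR code): keep columns that are populated
# in at least a fraction `perc` of rows.
keep_populated <- function(df, perc) {
  filled <- vapply(df, function(col) mean(!is.na(col)), numeric(1))
  df[, filled >= perc, drop = FALSE]
}

# Illustrative width only; the real target lengths are an assumption here
ids <- standardize_id(c("123", "0012345678", NA), width = 7)

# Toy ID columns: `a` is filled for 80% of rows, `b` for 20%
toy <- data.frame(a = c("1", "2", "3", NA, "5"), b = c("1", NA, NA, NA, NA))
kept <- keep_populated(toy, perc = 0.6)
```

With perc = 0.6, only column a survives in this toy table, because b is populated for fewer than 60% of rows.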
Value
data table with demographic information.
- ID_MERGE
numeric, IDs defined by merge_id, used for merging later.
- ID_dem_EMPI
string, Unique Partners-wide identifier assigned to the patient, used to consolidate patient information. Taken from the dem datasource, corresponds to EMPI in RPDR. Data is formatted using pretty_mrn().
- ID_dem_PMRN
string, Epic medical record number. This value is unique across Epic instances within the Partners network. Taken from the dem datasource, corresponds to EPIC_PMRN in RPDR. Data is formatted using pretty_mrn().
- ID_dem_loc
string, if mrn_type == TRUE, then the data in MRN_Type and MRN are parsed into IDs corresponding to locations (loc). Data is formatted using pretty_mrn().
- gender_legal_sex
string, Patient's legal sex, corresponds to Gender_Legal_Sex in RPDR.
- sex_at_birth
string, Patient’s sex at time of birth, corresponds to Sex_at_Birth in RPDR.
- gender_identity
string, Patient's personal conception of their gender, corresponds to Gender_Identity in RPDR.
- time_date_of_birth
POSIXct, Patient's date of birth, corresponds to Date_of_Birth in RPDR. Converted to POSIXct format.
- age
string, Patient's current age (or age at death), corresponds to Age in RPDR.
- language
string, Patient's preferred spoken language, corresponds to Language in RPDR.
- language_group
string, Patient's preferred language: English or Non-English, corresponds to Language_Group in RPDR.
- race_1
string, Patient's primary race, corresponds to Race1 in RPDR.
- race_2
string, Patient's primary race if more than one race, corresponds to Race2 in RPDR.
- race_group
string, Patient's Race Group as determined by Race1 and Race2, corresponds to Race_Group in RPDR.
- ethnic_group
string, Patient's Ethnicity: Hispanic or Non Hispanic, corresponds to Ethnic_Group in RPDR.
- marital
string, Patient's current marital status, corresponds to Marital_Status in RPDR.
- religion
string, Patient-identified religious preference, corresponds to Religion in RPDR.
- veteran
string, Patient's current military veteran status, corresponds to Is_a_veteran in RPDR.
- country_dem
string, Patient's current country of residence from dem datasource, corresponds to Country in RPDR.
- zip_dem
string, Mailing zip code of patient's primary residence from dem datasource, corresponds to Zip_code in RPDR. Formatted to 5 character zip codes.
- vital_status
string, Identifies if the patient is living or deceased. This data is updated monthly from the Partners registration system and the Social Security Death Master Index, corresponds to Vital_Status in RPDR. Punctuation marks are removed.
- time_date_of_death
POSIXct, Recorded date of death from source in 'Vital_Status'. Date of death information obtained solely from the Social Security Death Index will not be reported until 3 years after death due to privacy concerns. If the value is independently documented by a Partners entity within the 3 year window, then the date will be displayed. Corresponds to Date_of_Death in RPDR. Converted to POSIXct format.
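As a rough illustration of how the returned columns might be used, the sketch below merges a stand-in demographics table with a second dataset on ID_MERGE and derives an approximate age at death from the two POSIXct columns. All values are synthetic, d_other and its n_visits column are hypothetical, and a plain data.frame is used instead of a data table so that the example runs without extra packages.

```r
# Synthetic stand-in for the output of load_dem(); all values are invented
d_dem <- data.frame(
  ID_MERGE = c(1, 2, 3),
  time_date_of_birth = as.POSIXct(c("1950-01-01", "1960-06-15", "1970-03-20"), tz = "UTC"),
  time_date_of_death = as.POSIXct(c("2020-01-01", NA, NA), tz = "UTC")
)

# Hypothetical second dataset that shares the ID_MERGE column
d_other <- data.frame(ID_MERGE = c(1, 3), n_visits = c(4, 2))

# Left join on ID_MERGE: every patient in d_dem is kept
d_all <- merge(d_dem, d_other, by = "ID_MERGE", all.x = TRUE)

# Approximate age at death in years; NA for patients without a death date
d_all$age_at_death <- as.numeric(
  difftime(d_all$time_date_of_death, d_all$time_date_of_birth, units = "days")
) / 365.25
```

Dividing by 365.25 is only an approximation of calendar years; an exact calculation would need calendar-aware date arithmetic.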
Examples
## Not run:
#Using defaults
d_dem <- load_dem(file = "test_Dem.txt")
#Use sequential processing
d_dem <- load_dem(file = "test_Dem.txt", nThread = 1)
#Use parallel processing and parse data in MRN_Type and MRN columns and keep all IDs
d_dem <- load_dem(file = "test_Dem.txt", nThread = 20, mrn_type = TRUE, perc = 1)
## End(Not run)
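Rather than hard-coding a thread count, one option is to derive it from the machine. The sketch below is an assumption about sensible usage, not a recommendation from the package: parallel::detectCores() can return NA, and on a single-core machine detectCores() - 1 would be 0, so the value is clamped to at least 1. The variable names n_cores and n_threads are illustrative.

```r
# Number of cores reported by the system; may be NA on some platforms
n_cores <- parallel::detectCores()

# Leave one core free, but never go below a single thread
n_threads <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)

## Not run:
# d_dem <- load_dem(file = "test_Dem.txt", nThread = n_threads)
## End(Not run)
```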