load_dia {parseRPDR} R Documentation

Loads diagnoses into R.

Description

Loads diagnosis information into the R environment from both Dia and Dea files.

Usage

load_dia(
  file,
  merge_id = "EMPI",
  sep = ":",
  id_length = "standard",
  perc = 0.6,
  na = TRUE,
  identical = TRUE,
  nThread = parallel::detectCores() - 1,
  mrn_type = FALSE
)

Arguments

file

string, full file path to Dia.txt or Dea.txt.

merge_id

string, name of the column used to create the ID_MERGE column, which is used to merge different datasets. Defaults to EMPI, as shown in Usage. EPIC_PMRN is the preferred MRN in the RPDR system.

sep

string, divider between hospital ID and MRN. Defaults to :.

id_length

string, indicating whether to modify MRN lengths based on required values (id_length = "standard") or to keep lengths as is (id_length = "asis"). If id_length = "standard", then for MGH, BWH, MCL, EMPI and PMRN the MRN lengths are corrected by adding zeros or removing numerals from the beginning. In other cases the lengths are unchanged. Defaults to standard.
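The length correction described above can be pictured with a small base R sketch. The helper fix_id_length() and the width of 7 are hypothetical and chosen only for illustration; this is not the package's pretty_mrn() implementation, and the widths actually required for each ID type are not stated here.

```r
# Hypothetical helper, NOT part of parseRPDR: pad or trim IDs to a target width.
fix_id_length <- function(id, width) {
  id <- as.character(id)

  # IDs that are too long lose characters from the beginning
  too_long <- !is.na(id) & nchar(id) > width
  n <- nchar(id[too_long])
  id[too_long] <- substr(id[too_long], n - width + 1, n)

  # IDs that are too short are left-padded with zeros
  too_short <- !is.na(id) & nchar(id) < width
  pad <- strrep("0", width - nchar(id[too_short]))
  id[too_short] <- paste0(pad, id[too_short])

  id
}

# The width of 7 is an assumption made for this example only
fixed <- fix_id_length(c("12345", "1234567", "123456789"), width = 7)
print(fixed)
```

With id_length = "asis" no such step would apply, and the IDs would keep the lengths found in the file.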

perc

numeric, a number between 0 and 1 indicating which parsed ID columns to keep. ID columns with data present in perc x 100% of patients are kept. Defaults to 0.6.
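As a rough illustration of this kind of filter, the sketch below keeps the columns of a mock table whose share of non-missing values reaches perc. The column names and values are invented, and comparing with >= is an assumption; the package may apply the threshold differently.

```r
# Mock parsed ID columns; names and values are illustrative only
ids <- data.frame(
  ID_dia_MGH = c("0000001", "0000002", "0000003", NA, "0000005"),
  ID_dia_BWH = c("00000011", NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

perc <- 0.6

# Proportion of patients with a value in each column
present <- colMeans(!is.na(ids))

# Assumption: a column is kept when the proportion is at least perc
kept <- ids[, present >= perc, drop = FALSE]
print(names(kept))
```

Setting perc = 1 in this sketch would keep only columns that are filled for every patient, while perc = 0 would keep all of them.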

na

boolean, whether to remove columns with only NA values. Defaults to TRUE.

identical

boolean, whether to remove columns with identical values. Defaults to TRUE.

nThread

integer, number of threads to use to load data. Defaults to parallel::detectCores() - 1.
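When a thread count is passed explicitly, it can help to guard against edge cases of parallel::detectCores(), which may return NA when the number of cores cannot be determined and returns 1 on single-core machines. A minimal sketch, where the variable n_thread is only an illustrative name:

```r
# detectCores() ships with base R's parallel package
cores <- parallel::detectCores()

# Fall back to a single thread if the core count is unknown,
# and never go below one thread
n_thread <- if (is.na(cores)) 1L else max(1L, cores - 1L)
print(n_thread)
```

The resulting value could then be supplied as nThread = n_thread.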

mrn_type

boolean, whether data in MRN_Type and MRN should be parsed. Defaults to FALSE, as parsing these for all data sources is not advised because it takes considerable time.

Value

data table, with diagnoses information.

ID_MERGE

numeric, IDs defined by merge_id, used for merging later.

ID_dia_EMPI

string, Unique Partners-wide identifier assigned to the patient used to consolidate patient information from dia datasource, corresponds to EMPI in RPDR. Data is formatted using pretty_mrn().

ID_dia_PMRN

string, Epic medical record number. This value is unique across Epic instances within the Partners network from dia datasource, corresponds to EPIC_PMRN in RPDR. Data is formatted using pretty_mrn().

ID_dia_loc

string, if mrn_type == TRUE, then the data in MRN_Type and MRN are parsed into IDs corresponding to locations (loc). Data is formatted using pretty_mrn().

time_dia

POSIXct, Date when the diagnosis was noted, corresponds to Date in RPDR. Converted to POSIXct format.

dia_name

string, Name of the diagnosis, diagnosis-related group, or phenotype, corresponds to Diagnosis_Name in RPDR. For more information on available phenotypes visit https://phenotypes.partners.org/phenotype_list.html.

dia_code

string, Diagnosis, diagnosis-related group, or phenotype code, corresponds to Code in RPDR.

dia_code_type

string, Standardized classification system or custom grouping associated with the diagnosis code, corresponds to Code_type in RPDR.

dia_flag

string, Qualifier for the diagnosis, if any, corresponds to Diagnosis_flag in RPDR.

dia_enc_num

string, Unique identifier of the record/visit. This value includes the source system, hospital, and a unique identifier within the source system, corresponds to Encounter_number in RPDR.

dia_provider

string, Provider of record for the encounter where the diagnosis was entered, corresponds to Provider in RPDR.

dia_clinic

string, Specific department/location where the patient encounter took place, corresponds to Clinic in RPDR.

dia_hosp

string, Facility where the encounter occurred, corresponds to Hospital in RPDR.

dia_inpatient

string, Identifies whether the diagnosis was noted during an inpatient or outpatient encounter, corresponds to Inpatient_Outpatient in RPDR. Punctuation marks removed.
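To show how the returned columns might be used together, the sketch below builds a small mock table with a few of the column names listed above and finds each patient's earliest record for a chosen code prefix. All values are invented, including the code type labels "ICD10" and "ICD9", which may not match the labels found in real RPDR exports. A plain data.frame stands in for the data table returned by load_dia().

```r
# Mock stand-in for the output of load_dia(); all values are invented
d_dia <- data.frame(
  ID_MERGE      = c(1, 1, 2, 3),
  time_dia      = as.POSIXct(c("2015-03-01", "2018-06-15",
                               "2019-01-20", "2012-11-05"), tz = "UTC"),
  dia_code      = c("I21.4", "E11.9", "I21.9", "410.71"),
  dia_code_type = c("ICD10", "ICD10", "ICD10", "ICD9"),
  stringsAsFactors = FALSE
)

# Keep rows of one code type whose code starts with a given prefix
hits <- d_dia[d_dia$dia_code_type == "ICD10" &
                startsWith(d_dia$dia_code, "I21"), ]

# Earliest matching record per patient: sort by time, keep first row per ID
hits <- hits[order(hits$ID_MERGE, hits$time_dia), ]
first_dx <- hits[!duplicated(hits$ID_MERGE),
                 c("ID_MERGE", "time_dia", "dia_code")]
print(first_dx)
```

Because ID_MERGE is shared across datasets loaded with the same merge_id, a table like first_dx could then be joined to other sources on that column.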

Examples

## Not run: 
#Using defaults
d_dia <- load_dia(file = "test_Dia.txt")

#Use sequential processing
d_dia <- load_dia(file = "test_Dia.txt", nThread = 1)

#Use parallel processing and parse data in MRN_Type and MRN columns and keep all IDs
d_dea <- load_dia(file = "test_Dea.txt", nThread = 20, mrn_type = TRUE, perc = 1)

## End(Not run)

[Package parseRPDR version 1.1.1 Index]