process_data {bp} | R Documentation |
Data Pre-Processor
Description
A helper function to assist in pre-processing the user-supplied
input data in a standardized format for use with other functions in the bp
package.
See Vignette for further details.
Usage
process_data(
  data,
  bp_type = c("hbpm", "abpm", "ap"),
  ap = NULL,
  time_elap = NULL,
  sbp = NULL,
  dbp = NULL,
  date_time = NULL,
  id = NULL,
  group = NULL,
  wake = NULL,
  visit = NULL,
  hr = NULL,
  pp = NULL,
  map = NULL,
  rpp = NULL,
  DoW = NULL,
  ToD_int = NULL,
  eod = NULL,
  data_screen = TRUE,
  SUL = 240,
  SLL = 50,
  DUL = 140,
  DLL = 40,
  HRUL = 220,
  HRLL = 27,
  inc_low = TRUE,
  inc_crisis = TRUE,
  agg = FALSE,
  agg_thresh = 3,
  collapse_df = FALSE,
  dt_fmt = "ymd HMS",
  chron_order = FALSE,
  tz = "UTC"
)
Arguments
data |
User-supplied dataset containing blood pressure data. Must contain data for Systolic blood pressure and Diastolic blood pressure at a minimum. |
bp_type |
Required argument specifying which of the three BP data types
("HBPM", "ABPM", or "AP") the input data is. Defaults to "hbpm". |
ap |
(For AP data only) Required column name (character string) corresponding to continuous Arterial Pressure (AP) (mmHg). Note that this is a required argument so long as bp_type = "AP". Ensure that bp_type is set accordingly. |
time_elap |
(For AP data only) Column name corresponding to the time elapsed for the given AP waveform data. |
sbp |
Required column name (character string) corresponding to Systolic Blood Pressure (mmHg) |
dbp |
Required column name (character string) corresponding to Diastolic Blood Pressure (mmHg) |
date_time |
Optional column name (character string) corresponding to Date/Time. Supplying it is HIGHLY recommended if available. For DATE-only columns (with no associated time), leave date_time = NULL; DATE-only adjustments are automatic. Dates can be derived automatically from the DATE_TIME column provided that it is called "DATE_TIME" exactly. |
id |
Optional column name (character string) corresponding to subject ID. Typically needed for data corresponding to more than one subject. For one-subject datasets, ID will default to 1 (if ID column not found in dataset) |
group |
Optional column name (character string) corresponding to an additional grouping variable that can be used to further break down data. NOTE that this simply sets the column as "GROUP" so that other functions recognize which column to use as the grouping variable. |
wake |
Optional column name (character string) corresponding to sleep status. A WAKE value of 1 indicates that the subject is awake and 0 implies asleep. |
visit |
Optional column name (character string) corresponding to Visit number |
hr |
Optional column name (character string) corresponding to Heart Rate (bpm) |
pp |
Optional column name (character string) corresponding to Pulse Pressure (SBP - DBP). If not supplied, it will be calculated automatically. |
map |
Optional column name (character string) corresponding to Mean Arterial Pressure |
rpp |
Optional column name (character string) corresponding to Rate Pressure Product (SBP * HR). If not supplied but an HR column is available, RPP will be calculated automatically. |
DoW |
Optional column name (character string) corresponding to the Day of the Week.
If not supplied but a DATE or DATE_TIME column is available, DoW will be created
automatically. DoW values must be abbreviated as "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat". |
ToD_int |
Optional vector of length 4 whose values are hours from 0 to 23, given in the order Morning, Afternoon, Evening, Night. This vector overrides the default intervals for the Time-of-Day periods: if NULL, the Morning, Afternoon, Evening, and Night periods start at 6, 12, 18, and 0 respectively, where 0 corresponds to the 24th hour of the day (i.e. Midnight). For example, ToD_int = c(5, 13, 18, 23) would correspond to Morning starting at 5:00 (until 13:00), Afternoon starting at 13:00 (until 18:00), Evening starting at 18:00 (until 23:00), and Night starting at 23:00 (until 5:00). |
eod |
Optional argument to adjust the delineation for the end of day (eod). The supplied value should be a character string with 4 characters representing the digits of 24-hour time, e.g. "1310" corresponds to 1:10pm. For individuals who
do not go to bed early or work night-shifts, this argument adjusts the |
data_screen |
Optional logical argument; default set to TRUE. Screens for extreme values in the data
for both |
SUL |
Systolic Upper Limit (SUL). If |
SLL |
Systolic Lower Limit (SLL). If |
DUL |
Diastolic Upper Limit (DUL). If |
DLL |
Diastolic Lower Limit (DLL). If |
HRUL |
Heart Rate Upper Limit (HRUL). If see https://www.cdc.gov/physicalactivity/basics/measuring/heartrate.htm |
HRLL |
Heart Rate Lower Limit (HRLL). If |
inc_low |
Optional logical argument dictating whether or not to include the "Low" category for BP classification column (and the supplementary SBP/DBP Category columns). Default set to TRUE. |
inc_crisis |
Optional logical argument dictating whether or not to include the "Crisis" category for BP classification column (and the supplementary SBP/DBP Category columns). Default set to TRUE. |
agg |
Optional argument specifying whether or not to aggregate the data based on the amount of time
between observations. If |
agg_thresh |
Optional argument specifying the threshold of how many minutes can pass between readings (observations) and still be considered part of the same sitting. The default is set to 3 minutes. This implies that if two or more readings are within 3 minutes of each other, they will be averaged together (if agg is set to TRUE). |
collapse_df |
Optional argument that collapses the dataframe to eliminate repeating rows after aggregation. |
dt_fmt |
Optional argument that specifies the input date/time format (dt_fmt). Default set to "ymd HMS" but can take on any format specified by the lubridate package. See https://lubridate.tidyverse.org/reference/parse_date_time.html for more details. |
chron_order |
Optional argument that specifies whether to order the data in chronological order (oldest dates and times first) or reverse chronological order (most recent dates and times first). TRUE refers to chronological order; FALSE refers to reverse chronological order. The default is FALSE (i.e. most recent observations listed first in the dataframe). |
tz |
Optional argument denoting the respective time zone. Default time zone set to "UTC". See
Use |
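The ToD_int description above can be illustrated with a small base R sketch. The helper tod_label below is hypothetical and not part of the bp package; it assumes each period is a half-open interval that starts at its own hour and ends just before the next period's starting hour, wrapping past midnight where needed. How process_data treats the boundaries internally is not confirmed here.

```r
# Hypothetical helper (NOT a bp function): label hours 0-23 with a
# Time-of-Day period, given the four starting hours in the order
# Morning, Afternoon, Evening, Night.
tod_label <- function(hour, ToD_int = c(6, 12, 18, 0)) {
  stopifnot(length(ToD_int) == 4, all(ToD_int %in% 0:23))
  morning   <- ToD_int[1]
  afternoon <- ToD_int[2]
  evening   <- ToD_int[3]
  night     <- ToD_int[4]

  # TRUE when h lies in [start, end); the interval may wrap past midnight
  in_period <- function(h, start, end) {
    if (start < end) h >= start & h < end else h >= start | h < end
  }

  # Anything not Morning, Afternoon, or Evening is labelled Night
  ifelse(in_period(hour, morning, afternoon), "Morning",
  ifelse(in_period(hour, afternoon, evening), "Afternoon",
  ifelse(in_period(hour, evening, night), "Evening", "Night")))
}

# Default periods (6, 12, 18, 0)
tod_label(c(3, 6, 12, 18, 23))

# Custom periods from the ToD_int example above
tod_label(c(4, 5, 12, 13, 22, 23), ToD_int = c(5, 13, 18, 23))
```

Under these assumptions, hour 23 falls in Evening with the default periods but in Night with ToD_int = c(5, 13, 18, 23).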
Value
A processed dataframe object with standardized column names and formats for use with the rest of the bp package functions. The following standardized column names are used throughout:
BP_TYPE |
One of AP, HBPM or ABPM |
ID |
Subject ID |
SBP |
Systolic Blood Pressure |
DBP |
Diastolic Blood Pressure |
SBP_CATEGORY |
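Ordinal, SBP characterization into "Low" < "Normal" < "Elevated" < "Stage 1" < "Stage 2" < "Crisis". "Low" is not included if inc_low = FALSE, and "Crisis" is not included if inc_crisis = FALSE. |
DBP_CATEGORY |
Ordinal, DBP characterization into "Low" < "Normal" < "Elevated" < "Stage 1" < "Stage 2" < "Crisis". "Low" is not included if inc_low = FALSE, and "Crisis" is not included if inc_crisis = FALSE. |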
BP_CLASS |
Blood pressure categorization based on paired values (SBP, DBP) into one of the 8 stages according to Lee et al. 2020. See |
HR |
Heart Rate |
MAP |
Mean Arterial Pressure |
PP |
Pulse Pressure, SBP-DBP |
DATE_TIME |
Date and time in POSIXct format |
DATE |
Date only in Date format |
MONTH |
Month, integer from 1 to 12 |
DAY |
Day, integer from 1 to 31 |
YEAR |
Four digit year |
DAY_OF_WEEK |
Ordinal, with "Sun"<"Mon"<"Tue"<"Wed"<"Thu"<"Fri"<"Sat" |
TIME |
Time in character format |
HOUR |
Integer, from 0 to 23 |
TIME_OF_DAY |
One of "Morning", "Afternoon", "Evening" or "Night" |
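As a rough illustration of several of the columns listed above, the following base R sketch derives PP, RPP, DATE, HOUR, and an ordered DAY_OF_WEEK from a toy data frame. The readings are made up, and the code only mirrors the definitions given on this page (PP = SBP - DBP, RPP = SBP * HR); it is not the package's internal implementation.

```r
# Toy readings; values are invented for illustration only
df <- data.frame(
  SBP = c(118, 135, 162),
  DBP = c(76, 85, 101),
  HR  = c(64, 72, 88),
  DATE_TIME = as.POSIXct(
    c("2020-01-05 07:30:00", "2020-01-06 13:05:00", "2020-01-07 22:45:00"),
    tz = "UTC"
  )
)

df$PP   <- df$SBP - df$DBP                         # Pulse Pressure
df$RPP  <- df$SBP * df$HR                          # SBP * HR
df$DATE <- as.Date(df$DATE_TIME, tz = "UTC")       # Date only
df$HOUR <- as.integer(format(df$DATE_TIME, "%H"))  # Integer hour, 0 to 23

# Ordered factor with Sunday first. The numeric weekday ("%w", 0 = Sunday)
# is used so the result does not depend on the locale's day names.
dow_levels <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
df$DAY_OF_WEEK <- factor(
  dow_levels[as.integer(format(df$DATE_TIME, "%w")) + 1],
  levels = dow_levels, ordered = TRUE
)

df
```

Because DAY_OF_WEEK is an ordered factor, comparisons such as df$DAY_OF_WEEK < "Wed" are meaningful.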
References
Lee H, Yano Y, Cho SMJ, Park JH, Park S, Lloyd-Jones DM, Kim HC. Cardiovascular risk of isolated systolic or diastolic hypertension in young adults. Circulation. 2020; 141:1778–1786. doi: 10.1161/CIRCULATIONAHA.119.044838
Omboni S, Parati G, Zanchetti A, Mancia G. Calculation of trough:peak ratio of antihypertensive treatment from ambulatory blood pressure: methodological aspects. Journal of Hypertension. 1995; 13(10):1105–1112. doi: 10.1097/00004872-199510000-00005
Unger T, Borghi C, Charchar F, Khan NA, Poulter NR, Prabhakaran D, ... Schutte AE. 2020 International Society of Hypertension global hypertension practice guidelines. Hypertension. 2020; 75(6):1334–1357. doi: 10.1161/HYPERTENSIONAHA.120.15026
Examples
# Load bp_hypnos
data("bp_hypnos")

# Process data for bp_hypnos
hypnos_proc <- process_data(bp_hypnos,
                            bp_type = 'abpm',
                            sbp = 'syst',
                            dbp = 'diast',
                            date_time = 'date.time',
                            hr = 'hr',
                            pp = 'PP',
                            map = 'MaP',
                            rpp = 'Rpp',
                            id = 'id',
                            visit = 'Visit',
                            wake = 'wake',
                            data_screen = FALSE)
hypnos_proc

# Load bp_jhs data
data("bp_jhs")

# Process data for bp_jhs
# Note that bp_type defaults to "hbpm" and is therefore not specified
jhs_proc <- process_data(bp_jhs,
                         sbp = "Sys.mmHg.",
                         dbp = "Dias.mmHg.",
                         date_time = "DateTime",
                         hr = "Pulse.bpm.")
jhs_proc
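
The sitting-based averaging described under agg and agg_thresh can be sketched in base R without the package. This is a minimal illustration on invented readings, assuming that a new sitting starts whenever the gap to the previous reading exceeds agg_thresh minutes; the column names SITTING, SBP_AVG, and DBP_AVG are hypothetical, and the package's actual grouping rule and output columns may differ.

```r
# Invented readings: three taken close together, one much later
readings <- data.frame(
  DATE_TIME = as.POSIXct(
    c("2020-01-05 08:00:00", "2020-01-05 08:02:00",
      "2020-01-05 08:04:30", "2020-01-05 12:00:00"),
    tz = "UTC"
  ),
  SBP = c(120, 124, 128, 135),
  DBP = c(80, 82, 84, 90)
)
agg_thresh <- 3  # minutes, matching the documented default

# Sort chronologically, then measure the gap (in minutes) to the previous reading
readings <- readings[order(readings$DATE_TIME), ]
gap_min <- c(Inf, diff(as.numeric(readings$DATE_TIME)) / 60)

# Assumption: a gap larger than agg_thresh starts a new sitting
readings$SITTING <- cumsum(gap_min > agg_thresh)

# Average SBP and DBP within each sitting (hypothetical column names)
readings$SBP_AVG <- ave(readings$SBP, readings$SITTING)
readings$DBP_AVG <- ave(readings$DBP, readings$SITTING)

# Keep one row per sitting, similar in spirit to collapse_df = TRUE
collapsed <- unique(readings[, c("SITTING", "SBP_AVG", "DBP_AVG")])
collapsed
```

Here the first three readings are at most 2.5 minutes apart and form one sitting, while the noon reading forms its own.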