process_wide {dateutils} | R Documentation |
Process Wide Format Data
Description
Process data in wide format for time series modeling
Usage
process_wide(
dt_wide,
lib,
detrend = TRUE,
center = TRUE,
scale = TRUE,
date_name = "ref_date",
ignore_numeric_names = TRUE,
silent = FALSE
)
Arguments
dt_wide |
Data in wide format. |
lib |
Library with instructions regarding how to process data; see details. |
detrend |
T/F should data be detrended (see details)? |
center |
T/F should data be centered (i.e. de-meaned)? |
scale |
T/F should data be scaled (i.e. variance 1)? |
date_name |
Name of the date column in the data. |
ignore_numeric_names |
T/F, ignore numeric values when matching series names in 'dt_wide' to series names in 'lib'? This is required for data aggregated using 'process_MF()', as lags of LHS and RHS data are tagged 0 for contemporaneous data, 1 for one lag, 2 for two lags, etc. Ignoring these tags ensures that the processing instructions in 'lib' are correctly identified. |
silent |
T/F, suppress warnings? |
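The effect of 'ignore_numeric_names' can be pictured with a minimal sketch. The tag format (a space followed by the lag number at the end of the name) and the helper 'strip_lag_tag()' are assumptions made for illustration only; they are not taken from the dateutils source, which may match names differently.

```r
# Hypothetical column names as they might look after aggregation,
# assuming the lag tag is appended to the end of the series name.
tagged <- c("gdp constant prices 0",
            "gdp constant prices 1",
            "gdp constant prices 2")

# Hypothetical helper: drop a trailing numeric tag (and any separator
# before it) so the name can be looked up in lib$series_name.
strip_lag_tag <- function(x) sub("[[:space:]_]*[0-9]+$", "", x)

untagged <- strip_lag_tag(tagged)
unique(untagged)
```

With the tags removed, all three columns map to the single 'lib' entry for "gdp constant prices", so they would receive the same processing instructions.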
Details
'process_wide()' can be used to transform wide data to ensure stationarity. Censoring by pub_date requires long format.

Directions for processing each series come from the data.table 'lib'. This table must include the columns 'series_name', 'take_logs', and 'take_diffs'. Unique series may also be identified by a combination of 'country' and 'series_name'. Optional columns include 'needs_SA' for series that need seasonal adjustment, 'detrend' for removing low frequency trends (nowcasting only; 'detrend' should not be used for long horizon forecasts), 'center' to de-mean the data, and 'scale' to scale the data to unit variance.

If the 'detrend', 'center', or 'scale' argument to 'process_wide()' is 'FALSE', that operation will not be performed. If it is 'TRUE', the function will check for a column of the same name in 'lib'. If the column exists, the T/F entries in that column determine which series are transformed. If the column does not exist, all series will be transformed.
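The rule for 'detrend', 'center', and 'scale' can be sketched as follows. This is an illustration of the behavior described above, not the package's implementation: the helper 'which_series()' and the second series name are hypothetical, the T/F columns are assumed to be logical, and 'lib' is built as a plain data.frame only to keep the sketch free of dependencies (the function itself expects a data.table).

```r
# Hypothetical 'lib' with the three required columns and one optional
# column ('center'). There is deliberately no 'scale' column.
lib <- data.frame(
  series_name = c("gdp constant prices", "example monthly series"),
  take_logs   = c(TRUE, TRUE),
  take_diffs  = c(TRUE, FALSE),
  center      = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: which series would an operation apply to?
which_series <- function(arg, column, lib) {
  if (!arg) return(character(0))            # argument FALSE: never applied
  if (!column %in% names(lib)) {
    return(lib$series_name)                 # no column in lib: apply to all
  }
  lib$series_name[as.logical(lib[[column]])] # column present: use its T/F entries
}

centered <- which_series(TRUE,  "center", lib) # "gdp constant prices"
scaled   <- which_series(TRUE,  "scale",  lib) # both series
skipped  <- which_series(FALSE, "center", lib) # character(0)
```

Under these assumptions, adding a column to 'lib' narrows an operation to selected series, while setting the function argument to 'FALSE' switches it off for every series regardless of 'lib'.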
Value
data.table of processed data
Examples
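# Split the 'fred' data into the target (LHS) and predictor (RHS) series
LHS <- fred[series_name == "gdp constant prices"]
RHS <- fred[series_name != "gdp constant prices"]
# Aggregate the LHS and RHS data
dtQ <- process_MF(LHS, RHS)
# Reshape from long to wide: one row per ref_date, one column per series
dt_wide <- data.table::dcast(dtQ, ref_date ~ series_name, value.var = "value")
# Process the wide data using the instructions in 'fredlib'
dt_processed <- process_wide(dt_wide, fredlib)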