process_wide {dateutils}R Documentation

Process Wide Format Data

Description

Process data in wide format for time series modeling

Usage

process_wide(
  dt_wide,
  lib,
  detrend = TRUE,
  center = TRUE,
  scale = TRUE,
  date_name = "ref_date",
  ignore_numeric_names = TRUE,
  silent = FALSE
)

Arguments

dt_wide

Data in wide format.

lib

Library with instructions regarding how to process data; see details.

detrend

T/F should data be detrended (see details)?

center

T/F should data be centered (i.e. de-meaned)?

scale

T/F should data be scaled (i.e. variance 1)?

date_name

Name of data column in the data.

ignore_numeric_names

T/F ignore numeric values in matching series names in 'dt' to series names in 'lib'. This is required for data aggregated using 'process_MF()', as lags of LHS and RHS data are tagged 0 for contemporaneous data, 1 for one lag, 2 for 2 lags, etc. Ignoring these tags insures processing from 'lib' is correctly identified.

silent

T/F, supress warnings?

Details

'process_wide()' can be used to transform wide data to insure stationarity. Censoring by pub_date requires long format. Directions for processing each file come from the data.table 'lib'. This table must include the columns 'series_name', 'take_logs', and 'take_diffs'. Unique series may also be identified by a combination of 'country' and 'series_name'. Optional columns include 'needs_SA' for series that need seasonal adjustment, 'detrend' for removing low frequency trends (nowcasting only; 'detrend' should not be used for long horizon forecasts), 'center' to de-mean the data, and 'scale' to scale the data. If the argument to 'process_wide()' of 'detrend', 'center', or 'scale' is 'FALSE', the operation will not be performed. If 'TRUE', the function will check for the column of the same name in 'lib'. If the column exists, T/F entries from this column are used to determine which series to transform. If the column does not exist, all series will be transformed.

Value

data.table of processed data

Examples

LHS <- fred[series_name == "gdp constant prices"]
RHS <- fred[series_name != "gdp constant prices"]
dtQ <- process_MF(LHS, RHS)
dt_wide <- data.table::dcast(dtQ, ref_date ~ series_name, value.var = "value")
dt_processed <- process_wide(dt_wide, fredlib)

[Package dateutils version 0.1.5 Index]