process {dateutils}	R Documentation

Process Data

Description

Process long-format data to ensure stationarity for time series modeling.

Usage

process(
  dt,
  lib,
  detrend = TRUE,
  center = TRUE,
  scale = TRUE,
  as_of = NULL,
  date_name = "ref_date",
  id_name = "series_name",
  value_name = "value",
  pub_date_name = NULL,
  ignore_numeric_names = TRUE,
  silent = FALSE
)

Arguments

dt

Data in long format.

lib

Library with instructions regarding how to process data; see details.

detrend

T/F should data be detrended (see details)?

center

T/F should data be centered (i.e. de-meaned)?

scale

T/F should data be scaled (i.e. variance 1)?

as_of

"As of" date at which to censor observations for backtesting. This requires that 'pub_date_name' be specified.

date_name

Name of date column in the data.

id_name

Name of ID column in the data.

value_name

Name of value column in the data.

pub_date_name

Name of publication date column in the data; required if 'as_of' specified.

ignore_numeric_names

T/F ignore numeric values when matching series names in 'dt' to series names in 'lib'? This is required for data aggregated using 'process_MF()', as lags of LHS and RHS data are tagged 0 for contemporaneous data, 1 for one lag, 2 for two lags, etc. Ignoring these tags ensures that the processing instructions in 'lib' are matched to the correct series.

silent

T/F, suppress warnings?
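
To make the role of 'as_of' and 'pub_date_name' concrete, the following is a minimal base R sketch, not the package's implementation. It assumes that censoring means dropping every observation whose publication date falls after the "as of" date. The toy data and the column name 'pub_date' are invented for illustration.

```r
# Toy long-format data; series name, dates, and values are invented.
dt <- data.frame(
  series_name = c("a", "a", "a"),
  ref_date    = as.Date(c("2020-01-31", "2020-02-29", "2020-03-31")),
  pub_date    = as.Date(c("2020-02-15", "2020-03-15", "2020-04-15")),
  value       = c(1.0, 1.2, 1.1),
  stringsAsFactors = FALSE
)

# Hypothetical backtest date.
as_of <- as.Date("2020-03-20")

# Assumption: keep only rows already published on or before the "as of" date.
censored <- dt[dt$pub_date <= as_of, ]
censored
```

In this sketch the March observation is dropped because it was not yet published on the backtest date, which is why a publication date column is needed whenever 'as_of' is given.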

Details

'process()' can be used to transform data to ensure stationarity and to censor data for backtesting. Directions for processing each series come from the data.table 'lib'. This table must include the columns 'series_name', 'take_logs', and 'take_diffs'. Unique series may also be identified by a combination of 'country' and 'series_name'. Optional columns include 'needs_SA' for series that need seasonal adjustment, 'detrend' for removing low-frequency trends (nowcasting only; detrending should not be used for long-horizon forecasts), 'center' to de-mean the data, and 'scale' to scale the data. If the 'detrend', 'center', or 'scale' argument to 'process()' is 'FALSE', the operation will not be performed. If it is 'TRUE', the function checks for a column of the same name in 'lib'. If the column exists, its T/F entries determine which series to transform. If the column does not exist, all series are transformed.
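
The rule for 'detrend', 'center', and 'scale' can be sketched in base R as follows. This is an illustration of the logic described above, not code from the package: the helper 'resolve_flag()' is hypothetical, and the library table uses invented series and settings with only the required columns plus an optional 'center' column.

```r
# Hypothetical library table; series names and settings are illustrative only.
lib <- data.frame(
  series_name = c("gdp constant prices", "unemployment rate"),
  take_logs   = c(TRUE, FALSE),
  take_diffs  = c(TRUE, TRUE),
  center      = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the rule in Details; not part of dateutils.
resolve_flag <- function(arg, lib, column) {
  if (!isTRUE(arg)) {
    return(rep(FALSE, nrow(lib)))      # argument FALSE: transform nothing
  }
  if (column %in% names(lib)) {
    return(as.logical(lib[[column]]))  # column present: use per-series entries
  }
  rep(TRUE, nrow(lib))                 # column absent: transform every series
}

resolve_flag(TRUE,  lib, "center")  # per-series entries taken from 'lib'
resolve_flag(TRUE,  lib, "scale")   # no 'scale' column, so every series
resolve_flag(FALSE, lib, "center")  # argument is FALSE, so no series
```

The argument acts as a global switch, while the optional column in 'lib' refines the choice series by series.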

Value

data.table of processed values in long format.

Examples

dt <- process(fred, fredlib)

LHS <- fred[series_name == "gdp constant prices"]
RHS <- fred[series_name != "gdp constant prices"]
dtQ <- process_MF(LHS, RHS)
dt_processed <- process(dtQ, fredlib)

[Package dateutils version 0.1.5 Index]