process {dateutils} | R Documentation |
Process Data
Description
Process data in long format to ensure stationarity for time series modeling.
Usage
process(
dt,
lib,
detrend = TRUE,
center = TRUE,
scale = TRUE,
as_of = NULL,
date_name = "ref_date",
id_name = "series_name",
value_name = "value",
pub_date_name = NULL,
ignore_numeric_names = TRUE,
silent = FALSE
)
Arguments
dt |
Data in long format. |
lib |
Library with instructions regarding how to process data; see details. |
detrend |
T/F should data be detrended (see details)? |
center |
T/F should data be centered (i.e. de-meaned)? |
scale |
T/F should data be scaled (i.e. variance 1)? |
as_of |
"As of" date at which to censor observations for backtesting. This requires that 'pub_date_name' be specified. |
date_name |
Name of date column in the data. |
id_name |
Name of ID column in the data. |
value_name |
Name of value column in the data. |
pub_date_name |
Name of publication date column in the data; required if 'as_of' specified. |
ignore_numeric_names |
T/F ignore numeric values when matching series names in 'dt' to series names in 'lib'. This is required for data aggregated using 'process_MF()', as lags of LHS and RHS data are tagged 0 for contemporaneous data, 1 for one lag, 2 for two lags, etc. Ignoring these tags ensures that processing instructions from 'lib' are correctly identified. |
silent |
T/F, suppress warnings? |
Details
'process()' can be used to transform data to ensure stationarity and to censor data for backtesting. Directions for processing each series come from the data.table 'lib'. This table must include the columns 'series_name', 'take_logs', and 'take_diffs'. Unique series may also be identified by a combination of 'country' and 'series_name'. Optional columns include 'needs_SA' for series that need seasonal adjustment, 'detrend' for removing low frequency trends (nowcasting only; detrend should not be used for long horizon forecasts), 'center' to de-mean the data, and 'scale' to scale the data.
If the 'detrend', 'center', or 'scale' argument to 'process()' is 'FALSE', the operation will not be performed for any series. If 'TRUE', the function will check for the column of the same name in 'lib'. If the column exists, T/F entries from this column are used to determine which series to transform. If the column does not exist, all series will be transformed.
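The requirements on 'lib' described above can be sketched as a small hand-built table. This is only an illustration: the second series name is hypothetical, and the assumption that 'take_logs' and 'take_diffs' hold T/F values is inferred from the description rather than confirmed here. The optional 'scale' column shows how a per-series choice could be expressed.

```r
library(data.table)

# Hypothetical library table; "unemployment rate" is a made-up series name.
# Assumption: take_logs and take_diffs are logical (T/F) columns.
lib <- data.table(
  series_name = c("gdp constant prices", "unemployment rate"),
  take_logs   = c(TRUE, FALSE),   # log-transform the level series only
  take_diffs  = c(TRUE, TRUE),    # difference both series
  scale       = c(TRUE, FALSE)    # optional column: scale only the first series
)

# Check that the three required columns are present.
required <- c("series_name", "take_logs", "take_diffs")
all(required %in% names(lib))
```

Because this table has a 'scale' column but no 'center' or 'detrend' column, a call with all three arguments left at 'TRUE' would, per the description above, scale only the first series while centering and detrending every series.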
Value
data.table of processed values in long format.
Examples
dt <- process(fred, fredlib)
LHS <- fred[series_name == "gdp constant prices"]
RHS <- fred[series_name != "gdp constant prices"]
dtQ <- process_MF(LHS, RHS)
dt_processed <- process(dtQ, fredlib)
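A backtesting call might look like the following sketch. It assumes that 'fred' carries a publication date column named "pub_date", which is not stated on this page, and the "as of" date is an arbitrary value chosen for illustration.

```r
library(dateutils)

# Assumption: `fred` has a publication-date column called "pub_date".
# The as_of date below is arbitrary and only for illustration.
dt_backtest <- process(
  fred, fredlib,
  as_of = as.Date("2020-01-01"),  # censor observations as of this date
  pub_date_name = "pub_date"      # required whenever as_of is given
)
```

If the publication date column has a different name in your data, pass that name to 'pub_date_name' instead.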