t_running_centered {fromo}    R Documentation

Compare data to moments computed over a time sliding window.

Description

Computes moments over a sliding time window, then adjusts the data accordingly by centering, scaling, z-scoring, and so on.

Usage

t_running_centered(v, time = NULL, time_deltas = NULL, window = NULL,
  wts = NULL, na_rm = FALSE, min_df = 0L, used_df = 1, lookahead = 0,
  restart_period = 100L, variable_win = FALSE, wts_as_delta = TRUE,
  check_wts = FALSE, normalize_wts = TRUE)

t_running_scaled(v, time = NULL, time_deltas = NULL, window = NULL,
  wts = NULL, na_rm = FALSE, min_df = 0L, used_df = 1, lookahead = 0,
  restart_period = 100L, variable_win = FALSE, wts_as_delta = TRUE,
  check_wts = FALSE, normalize_wts = TRUE)

t_running_zscored(v, time = NULL, time_deltas = NULL, window = NULL,
  wts = NULL, na_rm = FALSE, min_df = 0L, used_df = 1, lookahead = 0,
  restart_period = 100L, variable_win = FALSE, wts_as_delta = TRUE,
  check_wts = FALSE, normalize_wts = TRUE)

t_running_sharpe(v, time = NULL, time_deltas = NULL, window = NULL,
  wts = NULL, lb_time = NULL, na_rm = FALSE, compute_se = FALSE,
  min_df = 0L, used_df = 1, restart_period = 100L, variable_win = FALSE,
  wts_as_delta = TRUE, check_wts = FALSE, normalize_wts = TRUE)

t_running_tstat(v, time = NULL, time_deltas = NULL, window = NULL,
  wts = NULL, lb_time = NULL, na_rm = FALSE, compute_se = FALSE,
  min_df = 0L, used_df = 1, restart_period = 100L, variable_win = FALSE,
  wts_as_delta = TRUE, check_wts = FALSE, normalize_wts = TRUE)

Arguments

v

a vector of data.

time

an optional vector of the timestamps of v. If given, must be the same length as v. If not given, we try to infer it by summing the time_deltas.

time_deltas

an optional vector of the deltas of timestamps. If given, must be the same length as v. If not given, and wts are given and wts_as_delta is true, we take the wts as the time deltas. The deltas must be positive. We sum them to arrive at the times.

window

the window size, in time units. If given as a finite integer or double, it is passed through. If NULL, NA_integer_, NA_real_ or Inf is given, and variable_win is true, then we infer the window from the lookback times: the first window is infinite, and each remaining window is the delta between consecutive lookback times. If variable_win is false, then these undefined values are equivalent to an infinite window. If negative, an error will be thrown.

wts

an optional vector of weights. Weights are ‘replication’ weights, meaning a value of 2 is shorthand for having two observations with the corresponding v value. If NULL, corresponds to equal unit weights, the default. Note that weights are typically only meaningfully defined up to a multiplicative constant, meaning the units of weights are immaterial, with the exception that methods which check for minimum df will, in the weighted case, check against the sum of weights. For this reason, weights less than 1 could cause NA to be returned unexpectedly due to the minimum condition. When weights are NA, the same rules for checking v are applied. That is, the observation will not contribute to the moment if the weight is NA when na_rm is true. When there is no checking, an NA value will cause the output to be NA.

na_rm

whether to remove NA, false by default.

min_df

the minimum df to return a value, otherwise NaN is returned. This can be used to prevent e.g. Z-scores from being computed on only 3 observations. Defaults to zero, meaning no restriction, which can result in infinite Z-scores during the burn-in period.

used_df

the number of degrees of freedom consumed, used in the denominator of the centered moments computation. These are subtracted from the number of observations.

lookahead

for some of the operations, the value is compared to the mean and standard deviation, possibly computed using 'future' or 'past' information by means of a non-zero lookahead. Positive values mean data are taken from the future. This is in time units, and so should be a real.

restart_period

the recompute period. Because subtraction of elements can cause loss of precision, the computation of moments is restarted periodically based on this parameter. Larger values mean fewer restarts and faster, though less accurate, results.

variable_win

if true, and the window is not a concrete number, the computation window becomes the time between lookback times.

wts_as_delta

if true and the time and time_deltas are not given, but wts are given, we take wts as the time_deltas.

check_wts

a boolean for whether the code shall check for negative weights, and throw an error when they are found. Default false for speed.

normalize_wts

a boolean for whether the weights should be renormalized to have a mean value of 1. This mean is computed over elements which contribute to the moments, so if na_rm is set, that means non-NA elements of wts that correspond to non-NA elements of the data vector.

lb_time

a vector of the times from which lookback will be performed. The output should be the same size as this vector. If not given, defaults to time.

compute_se

for running_sharpe, return an extra column of the standard error, as computed by Mertens' correction.
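The precision loss that restart_period guards against can be seen in a toy running sum. The sketch below is plain base R with made-up values, and is not the algorithm fromo uses: it updates a window sum by adding the entering element and subtracting the leaving one, then compares the result with a sum recomputed from scratch.

```r
# Toy data: one huge value followed by small ones (illustrative only).
x <- c(1e16, 1, 1, 1)

# Running sum over a window of 3 elements, updated by add/subtract.
s <- 0
for (i in seq_along(x)) {
  s <- s + x[i]                    # element entering the window
  if (i > 3) s <- s - x[i - 3]     # element leaving the window
}
running_sum <- s

# The same window sum, recomputed from scratch (a "restart").
direct_sum <- sum(x[2:4])

# The two values disagree because the small terms were absorbed by 1e16.
print(c(running = running_sum, direct = direct_sum))
```

A smaller restart_period bounds how long such errors can accumulate, at the cost of extra computation.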

Details

Given the length $n$ vector $x$, for a given index $i$, define $x^{(i)}$ as the elements of $x$ defined by the sliding time window (see the section on time windowing). Then define $\mu_i$, $\sigma_i$ and $n_i$ as, respectively, the sample mean, standard deviation and number of non-NA elements in $x^{(i)}$.

We compute an output vector $m$ the same size as $x$. For the 'centered' version of $x$, we have $m_i = x_i - \mu_i$. For the 'scaled' version of $x$, we have $m_i = x_i / \sigma_i$. For the 'z-scored' version of $x$, we have $m_i = (x_i - \mu_i) / \sigma_i$. For the 't-scored' version of $x$, we have $m_i = \sqrt{n_i} \mu_i / \sigma_i$.
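For a single window, the four adjustments can be sketched in base R as follows. This is a direct transcription of the definitions above, not the package's implementation. The helper name adjust_window and the sample values are made up for illustration, and the value being compared is assumed to be the last element of the window.

```r
# Hypothetical helper: apply the four definitions to one window of data.
# `xw` holds the elements in the window; x_i is taken to be the last one.
adjust_window <- function(xw) {
  xw_ok <- xw[!is.na(xw)]          # keep non-NA elements only
  n_i   <- length(xw_ok)           # number of non-NA elements
  mu_i  <- mean(xw_ok)             # sample mean
  sig_i <- sd(xw_ok)               # sample standard deviation (n_i - 1 denominator)
  x_i   <- xw[length(xw)]
  c(centered = x_i - mu_i,
    scaled   = x_i / sig_i,
    zscored  = (x_i - mu_i) / sig_i,
    tstat    = sqrt(n_i) * mu_i / sig_i)
}

res <- adjust_window(c(1, 2, 3, 4, 10))
print(res)
```

Here the window mean is 4 and the sample variance is 12.5, so the centered value is 6 and the z-score is 6 / sqrt(12.5).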

We also allow a 'lookahead' for some of these operations. If positive, the moments are computed using data from larger indices; if negative, from smaller indices.

Value

a vector the same size as the input consisting of the adjusted version of the input. When there are not sufficient (non-nan) elements for the computation, NaN are returned.
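A call might look like the sketch below. It uses only arguments listed in the Usage section; the simulated data, the window of 5 time units and the min_df of 3 are arbitrary choices for illustration, and the call is guarded so that it runs only when fromo is installed.

```r
set.seed(1234)
x  <- rnorm(20)                                  # simulated data (illustrative)
tm <- cumsum(runif(20, min = 0.5, max = 1.5))    # increasing, irregular timestamps

# Runs only if the fromo package is available.
if (requireNamespace("fromo", quietly = TRUE)) {
  # z-score each observation against the trailing 5 time units,
  # returning NaN where fewer than 3 degrees of freedom are available.
  z <- fromo::t_running_zscored(x, time = tm, window = 5, min_df = 3L)
  print(head(z))
}
```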

Time Windowing

This function supports time (or other counter) based running computation. Here the inputs are the data $x_i$, an optional vector of weights $w_i$, defaulting to 1, and a vector of time indices $t_i$ of the same length as $x$. The times must be non-decreasing:

$$t_1 \le t_2 \le \ldots$$

It is assumed that $t_0 = -\infty$. The window, $W$, is now a time-based window. An optional set of lookback times, $b_j$, may also be given; it may have a different length than $x$ and $w$. The output will correspond to the lookback times, and should be the same length. The $j$th output is computed over indices $i$ such that

$$b_j - W < t_i \le b_j.$$
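The window rule can be sketched in base R as a simple index selection. The helper name window_indices and all values below are hypothetical; the sketch only illustrates which observations fall in each window, not how the package computes moments.

```r
# Hypothetical helper: indices i whose times fall in (b_j - W, b_j].
window_indices <- function(times, b_j, W) {
  which(times > b_j - W & times <= b_j)
}

times <- c(1, 2, 4, 7, 8, 11)        # non-decreasing timestamps (made up)
x     <- c(10, 20, 30, 40, 50, 60)   # data observed at those times
lb    <- c(3, 8, 12)                 # lookback times; length differs from x
W     <- 5                           # window, in time units

idx   <- lapply(lb, function(b) window_indices(times, b, W))
means <- sapply(idx, function(ii) mean(x[ii]))
print(means)
```

With these values the three windows contain observations 1 to 2, 3 to 5, and 5 to 6, so the output has one entry per lookback time.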

For comparison functions (like Z-score, rescaling, centering), which compare values of $x_i$ to local moments, the lookbacks may not be given, but a lookahead $L$ is admitted. In this case, the $j$th output is computed over indices $i$ such that

$$t_j - W + L < t_i \le t_j + L.$$
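The effect of the lookahead on window membership can be sketched the same way. The helper name lookahead_indices and the timestamps are made up for illustration.

```r
# Hypothetical helper: indices i whose times fall in (t_j - W + L, t_j + L].
lookahead_indices <- function(times, j, W, L = 0) {
  which(times > times[j] - W + L & times <= times[j] + L)
}

times <- c(1, 2, 4, 7, 8, 11)   # non-decreasing timestamps (made up)

# No lookahead: the window for j = 3 (t_j = 4) looks back only.
no_look <- lookahead_indices(times, j = 3, W = 5, L = 0)

# Lookahead of 3 time units: the whole window shifts toward the future.
with_look <- lookahead_indices(times, j = 3, W = 5, L = 3)

print(list(no_look = no_look, with_look = with_look))
```

A positive lookahead shifts both ends of the window, so early observations drop out as later ones enter.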

If the times are not given, ‘deltas’ may be given instead. If $\delta_i$ are the deltas, then we compute the times as

$$t_i = \sum_{1 \le j \le i} \delta_j.$$

The deltas must be the same length as $x$. If times and deltas are not given, but weights are given and the ‘weights as deltas’ flag is set true, then the weights are used as the deltas.
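The fallback order for obtaining times can be sketched as below. The helper resolve_times is hypothetical and only mirrors the rules described in this section; the package's own argument handling may differ in detail.

```r
# Hypothetical helper mirroring the documented fallback:
# explicit times, else cumulative deltas, else weights used as deltas.
resolve_times <- function(time = NULL, time_deltas = NULL, wts = NULL,
                          wts_as_delta = TRUE) {
  if (!is.null(time)) return(time)
  if (is.null(time_deltas) && !is.null(wts) && wts_as_delta) {
    time_deltas <- wts
  }
  if (is.null(time_deltas)) stop("no time information available")
  stopifnot(all(time_deltas > 0))   # deltas must be positive
  cumsum(time_deltas)               # t_i is the sum of the first i deltas
}

t_from_deltas <- resolve_times(time_deltas = c(1, 1, 2, 3, 1, 3))
t_from_wts    <- resolve_times(wts = c(2, 1, 3))
print(t_from_deltas)
print(t_from_wts)
```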

Sometimes it makes sense to have the computational window be the space between lookback times. That is, the $j$th output is to be computed over indices $i$ such that

$$b_{j-1} - W < t_i \le b_j.$$

This can be achieved by setting the ‘variable window’ flag true and setting the window to null. This will not make much sense if the lookback times are equal to the times, since each moment computation is over a set of a single index, and most moments are underdefined.
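A base R sketch of the variable window follows. It assumes, based on the description of the window argument, that the first window is infinite and that each later window runs from the previous lookback time (exclusive) to the current one (inclusive). All values are made up.

```r
times <- c(1, 2, 4, 7, 8, 11)        # non-decreasing timestamps (made up)
x     <- c(10, 20, 30, 40, 50, 60)   # data observed at those times
lb    <- c(3, 8, 12)                 # lookback times

# Lower edge of each window: -Inf for the first, then the previous lookback.
lower <- c(-Inf, head(lb, -1))

# Sum of the observations falling between consecutive lookback times.
sums <- mapply(function(lo, hi) sum(x[times > lo & times <= hi]), lower, lb)
print(sums)

# If the lookback times equal the (distinct) times, every window holds
# exactly one observation, so higher moments cannot be estimated.
counts <- mapply(function(lo, hi) sum(times > lo & times <= hi),
                 c(-Inf, head(times, -1)), times)
print(counts)
```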

Note

The moment computations provided by fromo are numerically robust, but will often not provide the same results as the 'standard' implementations, due to differences in roundoff. We make every attempt to balance speed and robustness. User assumes all risk from using the fromo package.

Note that when weights are given, they are treated as replication weights. This can have subtle effects on computations which require minimum degrees of freedom, since the sum of weights will be compared to that minimum, not the number of data points. Weight values (much) less than 1 can cause computations to return NA somewhat unexpectedly due to this condition, while values greater than one might cause the computation to spuriously return a value with little precision.
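The replication interpretation can be checked in base R. The sketch below uses made-up integer weights and compares weighted formulas with the same statistics computed on explicitly replicated data. It is an illustration of the convention, not the package's code, and it ignores any renormalization of weights.

```r
x <- c(1.5, 2.0, 4.0)    # data (made up)
w <- c(2, 1, 3)          # replication weights (made up)

# Explicit replication: each x value repeated w times.
x_rep <- rep(x, times = w)

# Weighted mean and variance, with one degree of freedom consumed.
wmean <- sum(w * x) / sum(w)
wvar  <- sum(w * (x - wmean)^2) / (sum(w) - 1)

print(c(weighted_mean = wmean, replicated_mean = mean(x_rep)))
print(c(weighted_var = wvar, replicated_var = var(x_rep)))

# The effective number of observations is the sum of weights, not length(x).
eff_df <- sum(w)
print(eff_df)
```

Under this convention the three data points count as six observations, which is the quantity a minimum df check would see.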

Author(s)

Steven E. Pav shabbychef@gmail.com

References

Terriberry, T. "Computing Higher-Order Moments Online." http://people.xiph.org/~tterribe/notes/homs.html

J. Bennett, et al., "Numerically Stable, Single-Pass, Parallel Statistics Algorithms," Proceedings of IEEE International Conference on Cluster Computing, 2009. https://www.semanticscholar.org/paper/Numerically-stable-single-pass-parallel-statistics-Bennett-Grout/a83ed72a5ba86622d5eb6395299b46d51c901265

Cook, J. D. "Accurately computing running variance." http://www.johndcook.com/standard_deviation.html

Cook, J. D. "Comparing three methods of computing standard deviation." http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation

See Also

running_centered, scale


[Package fromo version 0.2.1 Index]