Median Absolute Deviation of Moving Windows

Description

Moving (aka running, rolling) Window MAD (Median Absolute Deviation) calculated over a vector

Usage

runmad(x, k, center = runmed(x,k), constant = 1.4826,
endrule=c("mad", "NA", "trim", "keep", "constant", "func"),
align = c("center", "left", "right"))

Details

Apart from the end values, the result of y = runmad(x, k) is the same as “for(j=(1+k2):(n-k2)) y[j]=mad(x[(j-k2):(j+k2)], na.rm = TRUE)”. It can handle non-finite numbers like NaN's and Inf's (like “mad(x, na.rm = TRUE)”).

The main incentive to write this set of functions was relative slowness of majority of moving window functions available in R and its packages. With the exception of runmed, a running window median function, all functions listed in "see also" section are slower than very inefficient “apply(embed(x,k),1,FUN)” approach.

Functions runquantile and runmad are using insertion sort to sort the moving window, but gain speed by remembering results of the previous sort. Since each time the window is moved, only one point changes, all but one points in the window are already sorted. Insertion sort can fix that in O(k) time.

Value

Returns a numeric vector or matrix of the same size as x. Only in case of endrule="trim" the output vectors will be shorter and output matrices will have fewer rows.

Author(s)

Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com

References

• Other moving window functions from this package: runmin, runmax, runquantile, runmean and runsd

• generic running window functions: apply (embed(x,k), 1, FUN) (fastest), running from gtools package (extremely slow for this purpose), subsums from magic library can perform running window operations on data with any dimensions.

Examples

# show runmed function
k=25; n=200;
x = rnorm(n,sd=30) + abs(seq(n)-n/4)
col = c("black", "red", "green")
m=runmed(x, k)
plot(x, col=col, main = "Moving Window Analysis Functions")
lines(m    , col=col)
lines(m-y/2, col=col)
lines(m+y/2, col=col)
legend(0,0.9*n, lab, col=col, lty=1 )

# basic tests against apply/embed
eps = .Machine\$double.eps ^ 0.5
k=25 # odd size window
stopifnot(all(abs(a-b)<eps));
k=24 # even size window
stopifnot(all(abs(a-b)<eps));

# test against loop approach
# this test works fine at the R prompt but fails during package check - need to investigate
k=24; n=200;
x = rnorm(n,sd=30) + abs(seq(n)-n/4) # create random data
x = rep(1:5,40)
#x[seq(1,n,11)] = NaN;               # commented out for time beeing - on to do list
#x = NaN;                         # commented out for time beeing - on to do list
k2 = k
k1 = k-k2-1
ac = array(runquantile(x,k,0.5))
bc = array(0,n)
b  = array(0,n)
for(j in 1:n) {
lo = max(1, j-k1)
hi = min(n, j+k2)
bc[j] = median(x[lo:hi], na.rm = TRUE)
b [j] = mad   (x[lo:hi], na.rm = TRUE, center=bc[j])
}
eps = .Machine\$double.eps ^ 0.5
#stopifnot(all(abs(ac-bc)<eps)); # commented out for time beeing - on to do list
#stopifnot(all(abs(a-b)<eps));   # commented out for time beeing - on to do list

# compare calculation at array ends
k=25; n=200;
x = rnorm(n,sd=30) + abs(seq(n)-n/4)
c = runquantile(x,k,0.5,type=2)             # find the center
b = runmad(x, k, center=c, endrule="func")  # slow R code
stopifnot(all(abs(a-b)<eps));

# test if moving windows forward and backward gives the same results
k=51;
stopifnot(all(a[n:1]==b, na.rm=TRUE));

# test vector vs. matrix inputs, especially for the edge handling
nRow=200; k=25; nCol=10
x = rnorm(nRow,sd=30) + abs(seq(nRow)-n/4)
X = matrix(rep(x, nCol ), nRow, nCol) # replicate x in columns of X
a = runmad(x, k, center = runquantile(x,k,0.5,type=2))
b = runmad(X, k, center = runquantile(X,k,0.5,type=2))
stopifnot(all(abs(a-b[,1])<eps));        # vector vs. 2D array
stopifnot(all(abs(b[,1]-b[,nCol])<eps)); # compare rows within 2D array

# speed comparison
## Not run:
x=runif(1e5); k=51;                       # reduce vector and window sizes