.runquantile {diveMove} | R Documentation |
Quantile of Moving Window
Description
Moving (aka running, rolling) Window Quantile calculated over a vector
Usage
.runquantile(
x,
k,
probs,
type = 7,
endrule = c("quantile", "NA", "trim", "keep", "constant", "func"),
align = c("center", "left", "right")
)
Arguments
x |
numeric vector of length n or matrix with n rows. If |
k |
width of moving window; must be an integer between one and n. |
probs |
numeric vector of probabilities with values in [0,1] range
used by |
type |
an integer between 1 and 9 selecting one of the nine
quantile algorithms, same as |
endrule |
character string indicating how the values at the beginning and the
end, of the array, should be treated. Only first and last * |
align |
specifies whether result should be centered (default),
left-aligned or right-aligned. If |
Details
Apart from the end values, the result of y = runquantile(x, k) is the
same as “for(j=(1+k2):(n-k2))
y[j]=quintile(x[(j-k2):(j+k2)],na.rm = TRUE)
”. It can handle
non-finite numbers like NaN's and Inf's (like quantile(x,
na.rm = TRUE)
).
The main incentive to write this set of functions was relative slowness
of majority of moving window functions available in R and its packages.
All functions listed in "see also" section are slower than very
inefficient “apply(embed(x,k),1,FUN)
”
approach. Relative speeds of runquantile
is O(n*k)
Function runquantile
uses insertion sort to sort the moving
window, but gain speed by remembering results of the previous
sort. Since each time the window is moved, only one point changes, all
but one points in the window are already sorted. Insertion sort can fix
that in O(k) time.
Value
If x
is a matrix than function runquantile
returns a
matrix of size [n \times
length
(probs)]. If
x
is vactor a than function runquantile
returns a matrix
of size [dim
(x) \times
length
(probs)]. If endrule="trim"
the output will
have fewer rows.
Author(s)
Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com
References
About quantiles: Hyndman, R. J. and Fan, Y. (1996) Sample quantiles in statistical packages, American Statistician, 50, 361.
About quantiles: Eric W. Weisstein. Quantile. From MathWorld– A Wolfram Web Resource. http://mathworld.wolfram.com/Quantile.html
About insertion sort used in runmad
and runquantile
: R.
Sedgewick (1988): Algorithms. Addison-Wesley (page 99)
Examples
## show plot using runquantile
k <- 31; n <- 200
x <- rnorm(n, sd=30) + abs(seq(n)-n/4)
y <- diveMove:::.runquantile(x, k, probs=c(0.05, 0.25, 0.5, 0.75, 0.95))
col <- c("black", "red", "green", "blue", "magenta", "cyan")
plot(x, col=col[1], main="Moving Window Quantiles")
lines(y[,1], col=col[2])
lines(y[,2], col=col[3])
lines(y[,3], col=col[4])
lines(y[,4], col=col[5])
lines(y[,5], col=col[6])
lab=c("data", "runquantile(.05)", "runquantile(.25)", "runquantile(0.5)",
"runquantile(.75)", "runquantile(.95)")
legend(0,230, lab, col=col, lty=1)
## basic tests against apply/embed
a <- diveMove:::.runquantile(x, k, c(0.3, 0.7), endrule="trim")
b <- t(apply(embed(x, k), 1, quantile, probs=c(0.3, 0.7)))
eps <- .Machine$double.eps ^ 0.5
stopifnot(all(abs(a - b) < eps))
## Test against loop approach
## This test works fine at the R prompt but fails during package check -
## need to investigate
k <- 25; n <- 200
x <- rnorm(n, sd=30) + abs(seq(n) - n / 4) # create random data
x[seq(1, n, 11)] <- NaN; # add NANs
k2 <- k %/% 2
k1 <- k - k2 - 1
a <- diveMove:::.runquantile(x, k, probs=c(0.3, 0.8))
b <- matrix(0, n, 2)
for(j in 1:n) {
lo <- max(1, j - k1)
hi <- min(n, j + k2)
b[j, ] <- quantile(x[lo:hi], probs=c(0.3, 0.8), na.rm=TRUE)
}
## stopifnot(all(abs(a-b)<eps));
## Compare calculation of array ends
a <- diveMove:::.runquantile(x, k, probs=0.4,
endrule="quantile") # fast C code
b <- diveMove:::.runquantile(x, k, probs=0.4,
endrule="func") # slow R code
stopifnot(all(abs(a - b) < eps))
## Test if moving windows forward and backward gives the same results
k <- 51
a <- diveMove:::.runquantile(x, k, probs=0.4)
b <- diveMove:::.runquantile(x[n:1], k, probs=0.4)
stopifnot(all(a[n:1]==b, na.rm=TRUE))
## Test vector vs. matrix inputs, especially for the edge handling
nRow <- 200; k <- 25; nCol <- 10
x <- rnorm(nRow, sd=30) + abs(seq(nRow) - n / 4)
x[seq(1, nRow, 10)] <- NaN # add NANs
X <- matrix(rep(x, nCol), nRow, nCol) # replicate x in columns of X
a <- diveMove:::.runquantile(x, k, probs=0.6)
b <- diveMove:::.runquantile(X, k, probs=0.6)
stopifnot(all(abs(a - b[, 1]) < eps)) # vector vs. 2D array
stopifnot(all(abs(b[, 1] - b[, nCol]) < eps)) # compare rows within 2D array
## Exhaustive testing of runquantile to standard R approach
numeric.test <- function (x, k) {
probs <- c(1, 25, 50, 75, 99) / 100
a <- diveMove:::.runquantile(x, k, c(0.3, 0.7), endrule="trim")
b <- t(apply(embed(x, k), 1, quantile, probs=c(0.3, 0.7), na.rm=TRUE))
eps <- .Machine$double.eps ^ 0.5
stopifnot(all(abs(a - b) < eps))
}
n <- 50
x <- rnorm(n,sd=30) + abs(seq(n) - n / 4) # nice behaving data
for(i in 2:5) numeric.test(x, i) # test small window sizes
for(i in 1:5) numeric.test(x, n - i + 1) # test large window size
x[seq(1, 50, 10)] <- NaN # add NANs and repet the test
for(i in 2:5) numeric.test(x, i) # test small window sizes
for(i in 1:5) numeric.test(x, n - i + 1) # test large window size
## Speed comparison
## Not run:
x <- runif(1e6); k=1e3 + 1
system.time(diveMove:::.runquantile(x, k, 0.5)) # Speed O(n*k)
## End(Not run)