ks_test {Ecume} | R Documentation |
Weighted Kolmogorov-Smirnov Two-Sample Test with threshold
ks_test(x, y, thresh = 0.05, w_x = rep(1, length(x)), w_y = rep(1, length(y)))
x |
Vector of values sampled from the first distribution |
y |
Vector of values sampled from the second distribution |
thresh |
The threshold that the distance between the two cumulative distributions must exceed |
w_x |
The observation weights for x |
w_y |
The observation weights for y |
The usual Kolmogorov-Smirnov test for two vectors X and Y, of sizes m and n, relies on the empirical cdfs E_x and E_y and the test statistic
D = \sup_{t \in (X, Y)} |E_x(t) - E_y(t)|.
This modified Kolmogorov-Smirnov test relies on two modifications.
Using observation weights for both vectors X and Y: the weights change the usual KS test in two places. First, the empirical cdfs are updated to account for the weights. Second, the effective sample sizes are modified accordingly. This is inspired by https://stackoverflow.com/a/55664242/13768995, using Monahan (2011).
Testing against a threshold: the test statistic D is replaced by its thresholded value, max(D - thresh, 0). Since 0 ≤ D ≤ 1, the threshold also lies between 0 and 1 and represents an effect size for the difference.
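The two modifications can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: the helpers `weighted_ecdf` and `weighted_ks_stat` are hypothetical names, and the effective sample size is assumed here to be Kish's formula, sum(w)^2 / sum(w^2). The p-value computation is left out.

```r
# Hypothetical helper (not part of Ecume): weighted empirical cdf of x,
# evaluated at each point of t.
weighted_ecdf <- function(x, w, t) {
  vapply(t, function(ti) sum(w[x <= ti]) / sum(w), numeric(1))
}

# Hypothetical helper: thresholded, weighted two-sample KS statistic.
weighted_ks_stat <- function(x, y, thresh = 0.05,
                             w_x = rep(1, length(x)),
                             w_y = rep(1, length(y))) {
  # The supremum is attained at one of the pooled observations.
  t <- sort(unique(c(x, y)))
  D <- max(abs(weighted_ecdf(x, w_x, t) - weighted_ecdf(y, w_y, t)))
  # Assumption: Kish's effective sample size for each weighted sample.
  n_x <- sum(w_x)^2 / sum(w_x^2)
  n_y <- sum(w_y)^2 / sum(w_y^2)
  # Thresholding: only the part of D above thresh counts.
  list(statistic = max(D - thresh, 0), n_x = n_x, n_y = n_y)
}

x <- c(0.1, 0.2, 0.3, 0.4)
y <- c(0.6, 0.7, 0.8, 0.9)
res <- weighted_ks_stat(x, y, thresh = 0.05)
res$statistic  # 0.95: the samples do not overlap, so D = 1 before thresholding
```

With unit weights the effective sample sizes reduce to length(x) and length(y), and with thresh = 0 the statistic reduces to the usual two-sample D.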
A list with class "htest" containing the following components:
statistic the value of the test statistic.
p.value the p-value of the test.
alternative a character string describing the alternative hypothesis.
method a character string indicating what type of test was performed.
data.name a character string giving the name(s) of the data.
Monahan, J. (2011). Numerical Methods of Statistics (2nd ed., Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge: Cambridge University Press. doi:10.1017/CBO9780511977176
x <- runif(100)
# Note: min and max are both .5 here, so every value of y equals 0.5
y <- runif(100, min = .5, max = .5)
ks_test(x, y, thresh = .001)