ds_test {dslice} | R Documentation |
Hypothesis testing via dynamic slicing
Description
Performs a one-sample or K-sample (K > 1) hypothesis test via dynamic slicing.
Usage
ds_test(y, x, ..., type = c("ds", "eqp"), lambda = 1, alpha = 1, rounds = 0)
Arguments
y |
A numeric vector of data values. |
x |
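Either an integer vector of sample labels taking values from 0 to K-1 (K-sample test), or a character string naming a continuous cumulative distribution function, such as "pnorm" (one-sample test). |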
... |
Parameters of the distribution specified (as a character string) by x. |
type |
Method applied for dynamic slicing, either "ds" (the default) or "eqp". |
lambda |
Penalty for introducing an additional slice, which is used to avoid making too many slices. It corresponds to the type I error under the scenario that the two variables are independent. |
alpha |
Penalty required for type "ds" in the one-sample test. |
rounds |
Number of permutations for estimating empirical p-value. |
Details
If x is an integer vector, ds_test performs a K-sample test (K > 1). In this scenario, the observations y are drawn from K continuous populations, and x stores the indicator of the population each observation comes from, i.e., x takes values 0, 1, ..., K-1. The null hypothesis is that these populations have the same distribution.
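The 0-based integer coding expected for x can be produced from arbitrary group labels with base R alone. The sketch below is a minimal illustration, assuming the labels start out as character strings; the label values and variable names are made up for this example, and the package's relabel function (used in Examples) can be used instead.

```r
# Hypothetical group labels for six observations from K = 3 populations
groups <- c("ctrl", "drugA", "drugB", "drugA", "ctrl", "drugB")

# factor() assigns codes 1..K in sorted label order; subtract 1 to get 0..K-1
x <- as.integer(factor(groups)) - 1L

x                  # 0 1 2 1 0 2
length(unique(x))  # K = 3
```

The resulting x is an integer vector, so passing it to ds_test selects the K-sample test rather than the one-sample test.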
If x is a character string naming a continuous (cumulative) distribution function, ds_test performs a one-sample test with the null hypothesis that y was generated by distribution x with the parameters given in the ... argument. These parameters must be pre-specified and not estimated from the data.
Only empirical p-values are available. They are obtained by specifying rounds, the number of permutations. Both lambda and alpha (for the one-sample test with type "ds") affect the p-value.
The procedure for choosing lambda is described in Jiang, Ye & Liu (2015). See the dataset ds_type_one_error in this package for the empirical relationship between lambda, sample size and type I error.
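As a rough illustration of what a permutation-based empirical p-value of the kind controlled by rounds looks like, the base-R sketch below shuffles the sample labels and recomputes a statistic each time. It is a generic permutation scheme under stated assumptions, not the package's internal code: stat_fun is a hypothetical stand-in (an absolute difference in means), not the dynamic slicing statistic, and the counting convention used by ds_test is not documented here.

```r
set.seed(1)

# Stand-in test statistic (NOT the dynamic slicing statistic):
# absolute difference between the two group means
stat_fun <- function(y, x) abs(mean(y[x == 1L]) - mean(y[x == 0L]))

n <- 50
y <- c(rnorm(n, -0.5, 1), rnorm(n, 0.5, 1))
x <- rep(0:1, each = n)

rounds <- 200
observed <- stat_fun(y, x)

# Recompute the statistic after shuffling the labels, which breaks any
# association between y and x
permuted <- replicate(rounds, stat_fun(y, sample(x)))

# Empirical p-value: share of permuted statistics at least as large
# as the observed one
p_value <- mean(permuted >= observed)
p_value
```

With this scheme the p-value can only take multiples of 1/rounds, so a larger rounds gives a finer estimate at a higher computational cost.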
Value
A list with class "htest" containing the following components:
statistic |
The value of the dynamic slicing statistic. |
p.value |
The p-value of the test. |
alternative |
A character string describing the alternative hypothesis. |
method |
A character string indicating what type of test was performed. |
data.name |
A character string giving the name(s) of the data. |
slices |
Slicing strategy that maximizes the dynamic slicing statistic in the K-sample test. Each row stands for a slice. Each column except the last gives the number of observations in that slice taking each value of x. The last column is the total number of observations in the slice, i.e., the sum of the first K columns. |
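The components listed above can be read from the returned list with the usual $ accessor. The following is a small sketch, assuming the dslice package is installed (the code is skipped otherwise); the simulated data and the name res are illustrative only.

```r
# Assumes the dslice package is installed; skipped otherwise
if (requireNamespace("dslice", quietly = TRUE)) {
  set.seed(1)
  n <- 100
  y <- c(rnorm(n, -0.5, 1), rnorm(n, 0.5, 1))
  x <- as.integer(rep(0:1, each = n))  # sample labels 0 and 1

  res <- dslice::ds_test(y, x, lambda = 1, rounds = 100)

  print(res$statistic)  # value of the dynamic slicing statistic
  print(res$p.value)    # empirical p-value from 100 permutations
  print(res$slices)     # one row per slice; last column is the slice total
}
```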
References
Jiang, B., Ye, C. and Liu, J.S. Non-parametric K-sample tests via dynamic slicing. Journal of the American Statistical Association, 110(510): 642-653, 2015.
Examples
## One-sample test
n <- 100
mu <- 0.5
y <- rnorm(n, mu, 1)
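lambda <- 1.0
alpha <- 1.0
## default type ("ds"), empirical p-value from 100 permutations
dsres <- ds_test(y, "pnorm", 0, 1, lambda = lambda, alpha = alpha, rounds = 100)
## type "ds" without permutations (rounds defaults to 0)
dsres <- ds_test(y, "pnorm", 0, 1, type = "ds", lambda = lambda, alpha = alpha)
## type "eqp" does not use alpha
dsres <- ds_test(y, "pnorm", 0, 1, type = "eqp", lambda = lambda, rounds = 100)
dsres <- ds_test(y, "pnorm", 0, 1, type = "eqp", lambda = lambda)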
## K-sample test
n <- 100
mu <- 0.5
y <- c(rnorm(n, -mu, 1), rnorm(n, mu, 1))
## generate x in this way:
x <- c(rep(0, n), rep(1, n))
x <- as.integer(x)
## or in this way:
x <- c(rep("G1", n), rep("G2", n))
x <- relabel(x)
lambda <- 1.0
## empirical p-values from 100 permutations, for each slicing type
dsres <- ds_test(y, x, lambda = lambda, rounds = 100)
dsres <- ds_test(y, x, type = "eqp", lambda = lambda, rounds = 100)