sd_test {semidist} | R Documentation |
Semi-distance independence test
Description
Implements the semi-distance independence test via a permutation test, or via
an asymptotic approximation when the dimensionality p of the continuous
variables is high.
Usage
sd_test(X, y, test_type = "perm", num_perm = 10000)
Arguments
X |
Data of multivariate continuous variables, which should be an
|
y |
Data of categorical variables, which should be a factor of length
|
test_type |
Type of the test:
See the Reference for details. |
num_perm |
The number of replications in the permutation test. Defaults to 10000. See Details and Reference. |
Details
The semi-distance independence test statistic is
T_n = n \cdot \widetilde{\text{SDcov}}_n(X, y),
where \widetilde{\text{SDcov}}_n(X, y) can be computed by sdcov(X, y, type = "U").
For the permutation test (test_type = "perm"), a total of K permutation
replications are conducted, where K is specified by the argument num_perm.
The p-value of the permutation test is computed by
\text{p-value} = (\sum_{k=1}^K I(T^{\ast (k)}_{n} \ge T_{n}) + 1) / (K + 1),
where T_{n} is the semi-distance test statistic and T^{\ast (k)}_{n} is the
test statistic computed on the k-th permutation sample.
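The p-value formula above can be sketched in base R. This is an illustrative sketch, not the package's internal implementation: perm_pvalue(), its stat_fun argument, and the stand-in statistic between_ss() are hypothetical names introduced here so that the code runs without semidist installed.

```r
# Permutation p-value: (sum_k I(T*_k >= T_n) + 1) / (K + 1).
# `stat_fun` (hypothetical) is any function(X, y) returning a single number.
perm_pvalue <- function(X, y, stat_fun, K = 999) {
  X <- as.matrix(X)
  t_obs <- stat_fun(X, y)
  # Shuffling y breaks any association with X but keeps both marginals fixed.
  t_perm <- replicate(K, stat_fun(X, sample(y)))
  (sum(t_perm >= t_obs) + 1) / (K + 1)
}

# Stand-in statistic (NOT the semi-distance covariance): between-group sum of
# squares of the column means, used only to keep the sketch self-contained.
between_ss <- function(X, y) {
  centre <- colMeans(X)
  groups <- split(seq_len(nrow(X)), y, drop = TRUE)
  sum(vapply(groups, function(idx) {
    length(idx) * sum((colMeans(X[idx, , drop = FALSE]) - centre)^2)
  }, numeric(1)))
}

set.seed(1)
X <- as.matrix(mtcars[, c("mpg", "disp", "drat", "wt")])
y <- factor(mtcars[, "am"])
p_val <- perm_pvalue(X, y, between_ss, K = 999)
print(p_val)
```

Because of the added 1 in the numerator and denominator, the p-value can never fall below 1 / (K + 1), so a larger num_perm gives finer resolution at the cost of more computation. With semidist installed, the statistic described above could be supplied as function(X, y) nrow(X) * semidist::sdcov(X, y, type = "U").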
When the dimension of the continuous variables is high, the asymptotic
approximation approach can be applied (test_type = "asym"), which is
computationally faster since no permutation is needed.
Value
A list with class "indtest" containing the following components:
- method: name of the test;
- name_data: names of the X and y;
- n: sample size of the data;
- test_type: type of the test;
- num_perm: number of replications in permutation test, if test_type = "perm";
- stat: test statistic;
- pvalue: computed p-value.
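As a brief illustration of working with the returned list, the sketch below reads individual components by name. It assumes the semidist package is installed (the block is skipped otherwise); the reduced num_perm and the 5% threshold are choices made for this example only.

```r
# Assumes the semidist package is available; otherwise nothing is run.
if (requireNamespace("semidist", quietly = TRUE)) {
  X <- mtcars[, c("mpg", "disp", "drat", "wt")]
  y <- factor(mtcars[, "am"])
  test <- semidist::sd_test(X, y, test_type = "perm", num_perm = 999)

  # Components documented in the Value section.
  print(test$stat)     # test statistic
  print(test$pvalue)   # permutation p-value
  # Decision at the 5% level (threshold chosen here, not set by the package).
  print(test$pvalue < 0.05)
}
```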
See Also
sdcov() for computing the statistic of semi-distance covariance.
Examples
X <- mtcars[, c("mpg", "disp", "drat", "wt")]
y <- factor(mtcars[, "am"])
test <- sd_test(X, y)
print(test)
# Man-made independent data -------------------------------------------------
n <- 30; R <- 5; p <- 3; prob <- rep(1/R, R)
X <- matrix(rnorm(n*p), n, p)
y <- factor(sample(1:R, size = n, replace = TRUE, prob = prob), levels = 1:R)
test <- sd_test(X, y)
print(test)
# Man-made functionally dependent data --------------------------------------
n <- 30; R <- 3; p <- 3
X <- matrix(0, n, p)
X[1:10, 1] <- 1; X[11:20, 2] <- 1; X[21:30, 3] <- 1
y <- factor(rep(1:3, each = 10))
test <- sd_test(X, y)
print(test)
# Man-made high-dimensionally independent data ------------------------------
n <- 30; R <- 3; p <- 100
X <- matrix(rnorm(n*p), n, p)
y <- factor(rep(1:3, each = 10))
test <- sd_test(X, y)
print(test)
test <- sd_test(X, y, test_type = "asym")
print(test)
# Man-made high-dimensionally dependent data --------------------------------
n <- 30; R <- 3; p <- 100
X <- matrix(0, n, p)
X[1:10, 1] <- 1; X[11:20, 2] <- 1; X[21:30, 3] <- 1
y <- factor(rep(1:3, each = 10))
test <- sd_test(X, y)
print(test)
test <- sd_test(X, y, test_type = "asym")
print(test)