MINTsemiperm {semidist} | R Documentation
Mutual information independence test (categorical-continuous case)
Description
Implements the mutual information independence test (MINT) (Berrett and Samworth, 2019), with a modification in how the mutual information (MI) between a categorical random variable and a continuous random variable is estimated. The modification is based on the idea of Ross (2014).
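As a rough illustration of the idea in Ross (2014), the sketch below estimates the MI between a factor and continuous data from nearest-neighbor counts. It is not the package's implementation; the function name ross_mi and the variables d_k and m_i are hypothetical, and the modification used in semidist may differ in detail.

```r
# Sketch of a Ross (2014)-style MI estimate (hypothetical helper, base R only).
ross_mi <- function(X, y, k = 3) {
  X <- as.matrix(X)
  y <- as.factor(y)
  n <- nrow(X)
  stopifnot(length(y) == n, all(table(y) > k))
  D <- as.matrix(dist(X))  # pairwise Euclidean distances
  terms <- numeric(n)
  for (i in seq_len(n)) {
    same <- which(y == y[i] & seq_len(n) != i)  # other points in the same class
    n_class <- length(same) + 1                 # class size, including point i
    d_k <- sort(D[i, same])[k]                  # distance to k-th same-class neighbor
    m_i <- sum(D[i, -i] <= d_k)                 # neighbors within d_k in the full sample
    terms[i] <- digamma(n) - digamma(n_class) + digamma(k) - digamma(m_i)
  }
  mean(terms)
}

# Two fully separated classes: every m_i equals k, so the estimate
# reduces to digamma(n) - digamma(n_class).
set.seed(42)
X <- c(rnorm(50, 0), rnorm(50, 100))
y <- factor(rep(c("a", "b"), each = 50))
est <- ross_mi(X, y, k = 3)
est
```

The sketch requires every class to contain more than k observations, which is why small values of k are the usual choice for small samples.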
MINTsemiperm() implements the permutation independence test via mutual information, but the parameter k should be pre-specified.
MINTsemiauto() automatically selects an appropriate k based on a data-driven procedure, and conducts MINTsemiperm() with the k chosen.
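The permutation step can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the code used by MINTsemiperm(): perm_test and centroid_gap are hypothetical names, centroid_gap is a simple stand-in for the MI statistic, and the p-value formula (1 + count) / (B + 1) is a common convention assumed here.

```r
# Generic permutation test: permute y to break any dependence on X,
# then compare the observed statistic with the permuted ones.
perm_test <- function(X, y, stat_fun, B = 1000) {
  X <- as.matrix(X)
  obs <- stat_fun(X, y)
  perm <- replicate(B, stat_fun(X, sample(y)))
  list(stat = obs, pvalue = (1 + sum(perm >= obs)) / (B + 1))
}

# Stand-in statistic (NOT mutual information): largest distance
# between class centroids.
centroid_gap <- function(X, y) {
  centers <- apply(X, 2, function(col) tapply(col, y, mean))
  max(dist(centers))
}

set.seed(1)
X <- mtcars[, c("mpg", "disp", "drat", "wt")]
y <- factor(mtcars[, "am"])
res <- perm_test(X, y, centroid_gap, B = 199)
res$pvalue
```

Any statistic that grows with the strength of dependence can be passed as stat_fun; an MI estimate would take the place of centroid_gap.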
Usage
MINTsemiperm(X, y, k, B = 1000)
MINTsemiauto(X, y, kmax, B1 = 1000, B2 = 1000)
Arguments
X |
Data of multivariate continuous variables, which should be an
|
y |
Data of categorical variables, which should be a factor of length
|
k |
Number of nearest neighbors. See References for details. |
B, B1, B2 |
Number of permutations to use. Defaults to 1000. |
kmax |
Maximum k in the automatic search for the optimal k. |
Value
A list with class "indtest" containing the following components:
- method: name of the test;
- name_data: names of the X and y;
- n: sample size of the data;
- num_perm: number of replications in permutation test;
- stat: test statistic;
- pvalue: computed p-value.
For MINTsemiauto(), the list also contains:
- kmax: maximum k in the automatic search for optimal k;
- kopt: optimal k chosen.
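The components listed above can be read from the returned list with the usual $ operator. The snippet below is a small sketch that assumes the semidist package is installed; it is skipped otherwise, and B = 200 is chosen only to keep the run short.

```r
# Inspect the components of the returned "indtest" list.
if (requireNamespace("semidist", quietly = TRUE)) {
  X <- mtcars[, c("mpg", "disp", "drat", "wt")]
  y <- factor(mtcars[, "am"])
  res <- semidist::MINTsemiperm(X, y, k = 5, B = 200)
  print(res$stat)      # test statistic
  print(res$pvalue)    # permutation p-value
  print(res$num_perm)  # number of permutation replications
}
```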
References
Berrett, Thomas B., and Richard J. Samworth. "Nonparametric independence testing via mutual information." Biometrika 106, no. 3 (2019): 547-566.
Ross, Brian C. "Mutual information between discrete and continuous data sets." PLoS ONE 9, no. 2 (2014): e87357.
Examples
X <- mtcars[, c("mpg", "disp", "drat", "wt")]
y <- factor(mtcars[, "am"])
MINTsemiperm(X, y, 5)
MINTsemiauto(X, y, kmax = 32)