hsic.clust {kpcalg}    R Documentation

HSIC cluster permutation conditional independence test

Description

Conditional independence test using HSIC and permutation with clusters.

Usage

hsic.clust(x, y, z, sig = 1, p = 100, numCluster = 10, numCol = 50,
  eps = 0.1, paral = 1)

Arguments

x

first variable

y

second variable

z

set of variables on which we condition

sig

the width of the Gaussian kernel

p

the number of permutations

numCluster

number of clusters for clustering z

numCol

maximum number of columns that we use for the incomplete Cholesky decomposition

eps

normalization parameter for HSIC cluster test

paral

number of cores used

Details

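Let x, y and z be samples of length n. The Gram matrices K, L and M are defined as K_{i,j} = \exp\left(-\frac{(x_i-x_j)^2}{\sigma^2}\right), L_{i,j} = \exp\left(-\frac{(y_i-y_j)^2}{\sigma^2}\right) and M_{i,j} = \exp\left(-\frac{(z_i-z_j)^2}{\sigma^2}\right), where \sigma is sig. Let H be the centring matrix, H_{i,j} = \delta_{i,j} - \frac{1}{n}, and let A=HKH, B=HLH and C=HMH. The test statistic is HSIC(X,Y|Z) = \frac{1}{n^2}Tr(AB-2AC(C+\epsilon I)^{-2}CB+AC(C+\epsilon I)^{-2}CBC(C+\epsilon I)^{-2}C), where \epsilon is eps.

The permutation test clusters Z into numCluster clusters and then permutes Y within the clusters of Z. This is repeated p times, giving Y_{(1)}, ..., Y_{(p)} and the replicates HSIC(X,Y_{(k)}|Z), k = 1, ..., p. The p-value is the fraction of replicates that exceed the observed statistic: pval = \frac{1}{p}\sum_{k=1}^{p} 1(HSIC(X,Y_{(k)}|Z)>HSIC(X,Y|Z)).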
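The computation described in Details can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: it builds the full n x n Gram matrices instead of using the incomplete Cholesky decomposition, it assumes the usual Gaussian kernel with a negative exponent, it assumes k-means for clustering z, and it takes the p-value as the fraction of permuted statistics that exceed the observed one. The helper names centred_gram, cond_weight, cond_hsic and permute_within are hypothetical and do not exist in kpcalg.

```r
# Centred Gaussian Gram matrix H K H, with H = I - 1/n (assumed kernel form).
centred_gram <- function(v, sig = 1) {
  v <- as.matrix(v)
  n <- nrow(v)
  K <- exp(-as.matrix(dist(v))^2 / sig^2)
  H <- diag(n) - matrix(1 / n, n, n)
  H %*% K %*% H
}

# W = C (C + eps I)^{-2} C, the term that appears repeatedly in the trace.
cond_weight <- function(C, eps = 0.1) {
  R <- solve(C + eps * diag(nrow(C)))
  C %*% R %*% R %*% C
}

# (1/n^2) Tr(AB - 2 A W B + A W B W), using Tr(XY) = sum(X * t(Y)).
cond_hsic <- function(A, B, W) {
  AW <- A %*% W
  BW <- B %*% W
  (sum(A * B) - 2 * sum(AW * B) + sum(AW * t(BW))) / nrow(A)^2
}

# Index vector that shuffles observations only within their own cluster.
permute_within <- function(cl) {
  idx <- seq_along(cl)
  for (k in unique(cl)) {
    members <- which(cl == k)
    # guard: sample() on a single number would sample from 1:that number
    if (length(members) > 1) idx[members] <- sample(members)
  }
  idx
}

set.seed(10)
n <- 100
z <- 10 * runif(n)
x <- sin(z) + runif(n)
y <- cos(z) + runif(n)

A <- centred_gram(x)
B <- centred_gram(y)
W <- cond_weight(centred_gram(z), eps = 0.1)
obs <- cond_hsic(A, B, W)

# Cluster z (k-means is an assumption here), then permute y within clusters.
p <- 50
cl <- kmeans(z, centers = 10)$cluster
perm_stats <- replicate(p, {
  idx <- permute_within(cl)
  # permuting y permutes rows and columns of its centred Gram matrix
  cond_hsic(A, B[idx, idx], W)
})
pval <- mean(perm_stats > obs)
```

Permuting y only within clusters of z keeps each permuted y value paired with a z value similar to its original one, so the replicates approximate the distribution of the statistic under conditional independence rather than under full independence.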

Value

hsic.clust() returns a list with class htest containing

method

description of test

statistic

observed value of the test statistic

estimate

HSIC(x,y)

estimates

a vector: [HSIC(x,y), mean of HSIC(x,y), variance of HSIC(x,y)]

replicates

replicates of the test statistic

p.value

approximate p-value of the test

data.name

description of data

Author(s)

Petras Verbyla (petras.verbyla@mrc-bsu.cam.ac.uk) and Nina Ines Bertille Desgranges

References

Tillman, R. E., Gretton, A. and Spirtes, P. (2009). Nonlinear directed acyclic structure learning with weakly additive noise models. NIPS 22, Vancouver.

Fukumizu, K., Gretton, A., Sun, X. and Schölkopf, B. (2007). Kernel Measures of Conditional Dependence. NIPS 20. https://papers.nips.cc/paper/3340-kernel-measures-of-conditional-dependence.pdf

See Also

hsic.gamma, hsic.perm, kernelCItest

Examples

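library(kpcalg)  # hsic.gamma, hsic.perm, hsic.clust
library(energy)  # dcov.test
set.seed(10)
# x and y are marginally dependent, but conditionally independent given z
z <- 10*runif(300)
x <- sin(z) + runif(300)
y <- cos(z) + runif(300)
plot(x,y)
# unconditional independence tests of x and y
hsic.gamma(x,y)
hsic.perm(x,y)
dcov.test(x,y)
# conditional independence test of x and y given z
hsic.clust(x,y,z)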

[Package kpcalg version 1.0.1 Index]