KPCgraph {KPC}                                                R Documentation

Kernel partial correlation with geometric graphs

Description

Calculate the kernel partial correlation (KPC) coefficient with a directed K-nearest neighbor (K-NN) graph or a minimum spanning tree (MST).

Usage

KPCgraph(
  Y,
  X,
  Z,
  k = kernlab::rbfdot(1/(2 * stats::median(stats::dist(Y))^2)),
  Knn = 1,
  trans_inv = FALSE
)

Arguments

Y

a matrix (n by dy)

X

a matrix (n by dx), or NULL if X is empty

Z

a matrix (n by dz)

k

a function k(y, y') of class kernel. It can be any kernel implemented in kernlab, e.g., the Gaussian kernel rbfdot(sigma = 1) or the linear kernel vanilladot().

Knn

a positive integer indicating the number of nearest neighbors to use, or "MST". A small Knn (e.g., Knn = 1) is recommended for an accurate estimate of the population KPC.

trans_inv

TRUE or FALSE. Is k(y, y) free of y? This holds for translation-invariant kernels such as the Gaussian kernel (see the quick check below).
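
For intuition, a quick check with kernlab kernels (an illustrative sketch, not part of the package API): the Gaussian kernel gives k(y, y) = 1 for every y, while the linear kernel does not.

library(kernlab)
k_rbf <- rbfdot(sigma = 1)
k_rbf(0.3, 0.3)   # 1
k_rbf(5, 5)       # 1; k(y, y) does not depend on y, so trans_inv = TRUE is appropriate
k_lin <- vanilladot()
k_lin(0.3, 0.3)   # 0.09
k_lin(5, 5)       # 25; k(y, y) depends on y, so use trans_inv = FALSE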

Details

The kernel partial correlation squared (KPC) measures the conditional dependence between Y and Z given X, based on an i.i.d. sample of (Y, Z, X). It converges to a population quantity (depending on the kernel) that lies between 0 and 1. A small value indicates low conditional dependence between Y and Z given X, and a large value indicates stronger conditional dependence. If X == NULL, the function returns KMAc(Y, Z, k, Knn), which measures the unconditional dependence between Y and Z. Euclidean distance is used for computing the K-NN graph and the MST. In practice, the MST often achieves performance similar to that of the 2-NN graph. A small Knn is recommended for an accurate estimate of the population KPC, while a larger Knn can be beneficial when KPC is used as a test statistic for conditional independence.
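
For instance, the reduction to the unconditional measure can be checked directly; a minimal sketch, assuming the positional argument order KMAc(Y, Z, k, Knn) stated above:

library(kernlab)
set.seed(1)
n = 500
z = rnorm(n)
y = z + rnorm(n)
KPCgraph(y, NULL, z, vanilladot(), Knn = 1, trans_inv = FALSE)
KMAc(y, z, vanilladot(), 1)   # should match the KPCgraph call above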

Value

The algorithm returns a real number which is the estimated KPC.

See Also

KPCRKHS, KMAc, Klin

Examples

library(kernlab)

# Example 1: Y = X + Z + noise, so Y and Z are conditionally dependent given X
n = 2000
x = rnorm(n)
z = rnorm(n)
y = x + z + rnorm(n,1,1)
KPCgraph(y,x,z,vanilladot(),Knn=1,trans_inv=FALSE)

# Example 2: Y = (X + Z) mod 1; Y and Z are strongly dependent given X
n = 1000
x = runif(n)
z = runif(n)
y = (x + z) %% 1
KPCgraph(y,x,z,rbfdot(5),Knn="MST",trans_inv=TRUE)

# Example 3: a user-defined discrete kernel; binary Y depends only on Z
discrete_ker = function(y1,y2) {
    if (y1 == y2) return(1)
    return(0)
}
class(discrete_ker) <- "kernel"
set.seed(1)
n = 2000
x = rnorm(n)
z = rnorm(n)
y = rep(0,n)
# P(Y = 1) = exp(-Z^2/2), so Y is conditionally dependent on Z given X
for (i in 1:n) y[i] = sample(c(1,0),1,prob = c(exp(-z[i]^2/2),1-exp(-z[i]^2/2)))
KPCgraph(y,x,z,discrete_ker,1)
## 0.330413
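
# A further minimal sketch (not from the packaged examples), relying on the
# defaults shown under Usage: Gaussian kernel with median-heuristic bandwidth,
# Knn = 1, trans_inv = FALSE.
set.seed(1)
n = 500
x = rnorm(n)
z = rnorm(n)
y = x + z + rnorm(n)
KPCgraph(y, x, z)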

[Package KPC version 0.1.2 Index]