ghcm_test {ghcm}    R Documentation

Conditional Independence Test using the GHCM

Description

Test whether X is independent of Y given Z using the Generalised Hilbertian Covariance Measure (GHCM). The test is computed from the residuals obtained by regressing X on Z and Y on Z. Its validity depends on how well the chosen regression methods perform. For a more in-depth explanation, see the package vignette or the paper listed in the references.

Usage

ghcm_test(
  resid_X_on_Z,
  resid_Y_on_Z,
  X_limits = NULL,
  Y_limits = NULL,
  alpha = 0.05
)

Arguments

resid_X_on_Z, resid_Y_on_Z

Residuals from regressing X (Y) on Z with a suitable regression method. If X (Y) is univariate, multivariate, or functional and observed on a constant, fixed grid, the residuals should be supplied as a vector or matrix with no missing values. If instead X (Y) is functional and observed on varying grids or with missing values, the residuals should be supplied as a "melted" data frame with the following columns:

.obs

Integer indicating which curve the row corresponds to.

.index

Function argument that the curve is evaluated at.

.value

Value of the function.

Note that in the irregular case, a minimum of 4 observations per curve is required.

X_limits, Y_limits

The minimum and maximum values of the function argument of the X (Y) curves. Ignored if X (Y) is not functional.

alpha

Numeric in the unit interval. Significance level of the test.
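
The "melted" layout described for resid_X_on_Z and resid_Y_on_Z can be sketched in base R as follows. This is only an illustration of the expected shape: the object name resid_long, the number of curves, and the simulated values are hypothetical placeholders, not residuals from a real regression.

```r
# Hypothetical toy data: 3 curves observed at irregular points.
# Each curve has at least 4 observations, as required in the irregular case.
set.seed(1)
n_points <- c(4, 5, 4)

resid_long <- data.frame(
  # Integer curve identifier, repeated once per observation of that curve
  .obs = rep(seq_along(n_points), times = n_points),
  # Function argument at which each observation was taken (here in [0, 1])
  .index = unlist(lapply(n_points, function(k) sort(runif(k)))),
  # Placeholder values standing in for the residuals
  .value = rnorm(sum(n_points))
)

# Number of observations per curve
table(resid_long$.obs)
```

A data frame of this shape would be passed in place of a vector or matrix, together with the matching limits argument (for example X_limits = c(0, 1) when the function argument ranges over the unit interval).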

Value

An object of class ghcm containing:

test_statistic

Numeric, test statistic of the test.

p

Numeric in the unit interval, estimated p-value of the test.

alpha

Numeric in the unit interval, significance level of the test.

reject

TRUE if p < alpha, FALSE otherwise.
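
As a minimal sketch of how the returned components might be inspected, the snippet below runs the test on simulated noise. It assumes the ghcm package is installed and that the components listed above can be accessed with `$`; the vectors res_x and res_y are hypothetical stand-ins for real regression residuals.

```r
library(ghcm)

set.seed(1)
# Independent noise used as placeholder residuals (an assumption for
# illustration only; in practice these come from regressions on Z)
res_x <- rnorm(100)
res_y <- rnorm(100)

out <- ghcm_test(res_x, res_y, alpha = 0.01)

out$test_statistic  # value of the test statistic
out$p               # estimated p-value
out$alpha           # significance level that was supplied
out$reject          # TRUE if p < alpha, FALSE otherwise
```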

References

Please cite the following paper: Anton Rask Lundborg, Rajen D. Shah and Jonas Peters (2022). "Conditional Independence Testing in Hilbert Spaces with Applications to Functional Data Analysis". Journal of the Royal Statistical Society: Series B (Statistical Methodology). doi:10.1111/rssb.12544.

Examples

if (require(refund)) {
  set.seed(1)
  data(ghcm_sim_data)
  grid <- seq(0, 1, length.out = 101)

  # Test independence of two scalars given a functional variable
  m_1 <- pfr(Y_1 ~ lf(Z), data = ghcm_sim_data)
  m_2 <- pfr(Y_2 ~ lf(Z), data = ghcm_sim_data)
  ghcm_test(resid(m_1), resid(m_2))

  # Test independence of a regularly observed functional variable and a
  # scalar variable given a functional variable
  m_X <- pffr(X ~ ff(Z), data = ghcm_sim_data, chunk.size = 31000)
  ghcm_test(resid(m_X), resid(m_1))

  # Test independence of two regularly observed functional variables given
  # a functional variable
  m_W <- pffr(W ~ ff(Z), data = ghcm_sim_data, chunk.size = 31000)
  ghcm_test(resid(m_X), resid(m_W))


  data(ghcm_sim_data_irregular)
  n <- length(ghcm_sim_data_irregular$Y_1)
  Z_df <- data.frame(.obs = seq_len(n))
  Z_df$Z <- ghcm_sim_data_irregular$Z

  # Test independence of an irregularly observed functional variable and a
  # scalar variable given a functional variable
  m_1 <- pfr(Y_1 ~ lf(Z), data = ghcm_sim_data_irregular)
  m_X <- pffr(X ~ ff(Z), ydata = ghcm_sim_data_irregular$X,
              data = Z_df, chunk.size = 31000)
  ghcm_test(resid(m_X), resid(m_1), X_limits = c(0, 1))

  # Test independence of two irregularly observed functional variables given
  # a functional variable
  m_W <- pffr(W ~ ff(Z), ydata = ghcm_sim_data_irregular$W,
              data = Z_df, chunk.size = 31000)
  ghcm_test(resid(m_X), resid(m_W), X_limits = c(0, 1), Y_limits = c(0, 1))
}


[Package ghcm version 3.0.1 Index]