gmcmtxBlk {generalCorr}    R Documentation

Matrix R* of generalized correlation coefficients captures nonlinearities using blocks.

Description

The algorithm uses two auxiliary functions, getSeq and NLhat. The latter uses the kern function to kernel regress x on y and, conversely, y on x. It needs the package ‘np’, which reports residuals and allows one to compute fitted values (xhat, yhat). Unlike gmcmtx0, this function treats blocks of blksiz=10 (default) pairs of data points separately, with a distinct bandwidth for each block, which usually gives superior local fits.

Usage

gmcmtxBlk(mym, nam = colnames(mym), blksiz = 10)

Arguments

mym

A matrix of data on selected variables arranged in columns

nam

Column names of the variables in the data matrix

blksiz

Block size, default=10. If the chosen blksiz > n, where n is the number of rows in the matrix, then blksiz is set to n, that is, no blocking is done.

Details

This function checks for missing data pairwise, for all pairs of columns. Assume that the input matrix ‘mym’ has n rows, some of which contain missing values. If the columns of mym are denoted (X1, X2, ..., Xp), we consider all pairs (Xi, Xj), treated as (x, y), with ‘nv’ valid (non-missing) rows. Note that x and y are each (nv by 1) vectors. The function further splits these (x, y) vectors into as many subgroups or blocks as the nv valid paired data points require for the chosen block length (blksiz).
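The pairwise filtering and blocking described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's code: the helper name split_blocks is hypothetical, and the actual splitting is done by getSeq, whose handling of a short final block may differ.

```r
# Hypothetical helper (not part of generalCorr): keep only rows where both
# x and y are observed, then cut the nv valid rows into consecutive blocks.
split_blocks <- function(x, y, blksiz = 10) {
  ok <- complete.cases(x, y)           # pairwise check for missing values
  x <- x[ok]
  y <- y[ok]
  nv <- length(x)                      # number of valid (x, y) pairs
  blksiz <- min(blksiz, nv)            # blksiz > nv means no blocking
  id <- ceiling(seq_len(nv) / blksiz)  # block label for each valid row
  list(x = split(x, id), y = split(y, id), nv = nv)
}

x <- c(1:12, NA, 14:25)                # one missing x value
y <- sin(1:25)
y[3] <- NA                             # one missing y value
b <- split_blocks(x, y, blksiz = 10)
b$nv                                   # number of valid pairs
lengths(b$x)                           # sizes of the consecutive blocks
```

With 23 valid pairs and blksiz=10, this sketch yields two full blocks and a shorter final block of three points.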

Next, the algorithm strings together the blocks of fitted values into vectors (xhat, yhat), each also of dimension nv by 1. Then, for each pair Xi, Xj (column Xj = cause, row Xi = response, treated as x and y), the algorithm computes R*ij as the simple Pearson correlation coefficient between (x, xhat), and R*ji as the correlation coefficient between (y, yhat). Finally, it assigns to |R*ij| and |R*ji| the observed sign of the Pearson correlation coefficient between x and y.
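A rough base-R sketch of this computation for a single (x, y) pair follows. It rests on loudly stated assumptions: the package fits each block through the np-based kern function, whereas the stand-in here is a hand-written Nadaraya-Watson smoother with a rule-of-thumb bandwidth (bw.nrd0) per block, so its numbers will not match those of gmcmtxBlk. The names nw_fit, block_fit, and rstar_pair are hypothetical.

```r
# Nadaraya-Watson fit of resp on pred (Gaussian kernel, bandwidth h).
# Assumption: a stand-in for the np-based kernel regression in the package.
nw_fit <- function(pred, resp, h) {
  vapply(pred, function(p) {
    w <- dnorm((pred - p) / h)         # kernel weights centered at p
    sum(w * resp) / sum(w)             # locally weighted mean of resp
  }, numeric(1))
}

# Fit each block separately with its own bandwidth, then string the
# fitted values together into one vector of the original length.
block_fit <- function(pred, resp, blksiz = 10) {
  n <- length(pred)
  id <- ceiling(seq_len(n) / min(blksiz, n))
  fit <- numeric(n)
  for (k in unique(id)) {
    i <- which(id == k)
    # Assumption: rule-of-thumb bandwidth; a lone point gets h = 1
    h <- if (length(i) > 1) bw.nrd0(pred[i]) else 1
    fit[i] <- nw_fit(pred[i], resp[i], h)
  }
  fit
}

# Signed generalized correlations for one pair with no missing values.
rstar_pair <- function(x, y, blksiz = 10) {
  xhat <- block_fit(y, x, blksiz)      # x regressed on y, block by block
  yhat <- block_fit(x, y, blksiz)      # y regressed on x, block by block
  s <- sign(cor(x, y))                 # observed sign of Pearson r(x, y)
  c(Rstar_ij = s * abs(cor(x, xhat)),
    Rstar_ji = s * abs(cor(y, yhat)))
}

x <- 1:20
y <- sin(x)
r <- rstar_pair(x, y, blksiz = 10)
r
```

The two entries generally differ, which is the asymmetry that the R* matrix is meant to expose.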

Its advantages, discussed in Vinod (2015, 2019), are: (i) It is asymmetric, yielding causal direction information by relaxing the assumption of linearity implicit in the usual correlation coefficients. (ii) The R* correlation coefficients are generally larger upon admitting arbitrary nonlinearities. (iii) max(|R*ij|, |R*ji|) measures (nonlinear) dependence.

For example, let x=1:20 and y=sin(x). This y has a perfect (100 percent) nonlinear dependence on x, and yet the Pearson correlation coefficient r(x,y)= -0.0948372 is near zero, and its 95% confidence interval (-0.516, 0.363) includes zero, implying that the population r(x,y) is not significantly different from zero. This example highlights a serious failure of the traditional r(x,y) in measuring dependence between x and y when nonlinearities are present. gmcmtx0, without blocking, does work if x=1:n and y=f(x)=sin(x) are used with n<20. For larger n, however, the fixed bandwidth used by the kern function becomes a problem. The block version has a separate bandwidth for each block, and hence it correctly quantifies the presence of high dependence even when x=1:n and y=f(x) are defined for large n and complicated nonlinear functional forms for f(x).
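The Pearson figures quoted in this example can be checked with base R alone. The snippet below is a minimal sketch that uses only cor.test from the stats package; it does not call generalCorr.

```r
x <- 1:20
y <- sin(x)                 # y is an exact nonlinear function of x
ct <- cor.test(x, y)        # Pearson correlation and 95% confidence interval
ct$estimate                 # close to zero despite perfect dependence
ct$conf.int                 # interval with a negative lower and positive upper limit
```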

Value

A non-symmetric R* matrix of generalized correlation coefficients

Author(s)

Prof. H. D. Vinod, Economics Dept., Fordham University, NY

References

Vinod, H. D. 'Generalized Correlation and Kernel Causality with Applications in Development Economics' in Communications in Statistics - Simulation and Computation, 2015, doi:10.1080/03610918.2015.1122048

Vinod, H. D. 'Matrix Algebra Topics in Statistics and Economics Using R', Chapter 4 in 'Handbook of Statistics: Computational Statistics with R', Vol. 32, co-editors: M. B. Rao and C. R. Rao. New York: North Holland, Elsevier Science Publishers, 2014, pp. 143-176.

Vinod, H. D. 'New exogeneity tests and causal paths', Chapter 2 in 'Handbook of Statistics: Conceptual Econometrics Using R', Vol. 41, co-editors: H. D. Vinod and C. R. Rao. New York: North Holland, Elsevier Science Publishers, 2019, pp. 33-64.

Zheng, S., Shi, N.-Z., and Zhang, Z. (2012). 'Generalized measures of correlation for asymmetry, nonlinearity, and beyond,' Journal of the American Statistical Association, vol. 107, pp. 1239-1252.

Examples

 
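## Not run: 
library(generalCorr)
# y depends perfectly, but nonlinearly, on x
x <- 1:20; y <- sin(x)
# R* matrix computed with two blocks of 10 points each
gmcmtxBlk(cbind(x, y), blksiz = 10)
## End(Not run)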


[Package generalCorr version 1.2.6]