GiniDistance-package {GiniDistance}R Documentation

GiniDistance

Description

A new Gini correlation to measure dependence between categorical and numerical variables are implemented. Analogous to Pearson in ANOVA model, the Gini correlation is interpreted as the ratio of the between-group variation and the total variation, but it characterizes independence (zero Gini correlation mutually implies independence). Closely related to the distance correlation, the Gini correlation is of the simple formulation by considering the nature of the categorical variable. As a result, the Gini correlation has a lower computational cost than the distance correlation and is more straightforward to perform inference. The dependence test and confidence interval are implemented. Also, the corresponding kernelized dependence measures are also implemented.

Details

The details are described in the following papers "A new Gini correlation between quantitative and qualitative variables" and "Estimating Feature-Label Dependence Using Gini Distance Statistics"

Author(s)

Dao Nguyen dxnguyen@olemiss.edu and Xin Dang xdang@olemiss.edu

References

Dang, X., Nguyen, D., Chen, Y. and Zhang, J., (2019). A new Gini correlation between quantitative and qualitative variables, Journal of the American Statistical Association (submitted), https://arxiv.org/pdf/1809.09793.pdf

Zhang, S., Dang, X., Nguyen, D. and Chen, Y. (2019). Estimating feature - label dependence using Gini distance statistics. IEEE Transactions on Pattern Analysis and Machine Intelligence (submitted), https://arXiv.org/pdf/1906.02171.pdf


[Package GiniDistance version 0.1.1 Index]