R: Count and continuous generalized variability indexes

GWI-package {GWI}

R Documentation

Count and continuous generalized variability indexes

Description

The package provides di.fun for the univariate Poisson dispersion index and vi.fun for the univariate exponential variation index. It also provides dib.fun for the univariate binomial dispersion index, dinb.fun for the univariate negative binomial dispersion index and viiG.fun for the univariate inverse Gaussian variation index. Finally, gmdi.fun computes the generalized dispersion index and its marginal counterpart, and gmvi.fun computes the generalized variation index and its marginal counterpart.

Details

The univariate Poisson dispersion index (DI) and its relative versions with respect to binomial and negative binomial distributions:

The Poisson dispersion phenomenon is well known and widely used in practice; see, e.g., Kokonendji (2014) for a review of count (or discrete integer-valued) models. Many interpretable mechanisms lead to this phenomenon, which makes it possible to classify count distributions and to make inference; see, e.g., Mizère et al. (2006) and Touré et al. (2020) for approximate statistical tests. Introduced by Fisher (1934), the Poisson dispersion index, also called the Fisher dispersion index, of a count random variable X on S=\{0,1,2,\ldots\}=:N_0 can be defined as

DI(X)=\frac{VarX}{EX},

the ratio of variance to mean. In fact, the positive quantity DI(X) is the ratio of two variances since EX is the expected variance under the Poisson distribution. Hence, one easily deduces the concept of the relative dispersion index (denoted by RDI) by choosing another reference than the Poisson distribution. Indeed, if X and Y are two count random variables on the same support S\subseteq N_0 such that EX=EY, then

RDI_Y(X):=\frac{VarX}{VarY}=\frac{DI(X)}{DI(Y)} \gtreqqless 1;

i.e. X is over-, equi- and under-dispersed compared to Y if VarX > VarY, VarX = VarY and VarX < VarY, respectively.

For instance, the binomial dispersion index is defined as

RDI_B(X)=\frac{VarX}{EX(1-EX/N)},

where N\in \{1,2,\ldots\} is the fixed number of trials. Also, the negative binomial dispersion index is defined as

RDI_{NB}(X)=\frac{VarX}{EX(1+EX/\lambda)},

where \lambda > 0 is the dispersion parameter. See also Weiss (2018, page 15) and Abid et al. (2021) for more details.
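The three count indexes above can be illustrated with plug-in moment estimates. The following is a minimal sketch in base R, not the package's implementation: the helper names di_hat, rdi_binom_hat and rdi_nbinom_hat are hypothetical, the simulated samples are illustrative only, and the package's own di.fun, dib.fun and dinb.fun may differ in arguments and return values. Note that var() uses the n - 1 denominator.

```r
# Hypothetical helpers (not the GWI API): plug-in estimates of DI, RDI_B, RDI_NB.
di_hat <- function(x) var(x) / mean(x)
rdi_binom_hat <- function(x, N) var(x) / (mean(x) * (1 - mean(x) / N))
rdi_nbinom_hat <- function(x, lambda) var(x) / (mean(x) * (1 + mean(x) / lambda))

set.seed(1)
n <- 5000
x_pois <- rpois(n, lambda = 4)              # Poisson reference
x_bin  <- rbinom(n, size = 10, prob = 0.3)  # N = 10 trials
x_nb   <- rnbinom(n, size = 2, mu = 4)      # Var = mu + mu^2 / size, so lambda = size

di_hat(x_pois)                    # expected to be near 1 (equi-dispersion)
di_hat(x_bin)                     # expected below 1 (under-dispersed vs Poisson)
di_hat(x_nb)                      # expected above 1 (over-dispersed vs Poisson)
rdi_binom_hat(x_bin, N = 10)      # expected near 1 relative to the binomial
rdi_nbinom_hat(x_nb, lambda = 2)  # expected near 1 relative to the negative binomial
```

The same sample can thus be under-dispersed with respect to the Poisson reference and roughly equi-dispersed with respect to the binomial one, which is the point of changing the reference distribution.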

The univariate variation index (VI) and its relative version with respect to the inverse Gaussian distribution:

More recently, Abid et al. (2020) introduced the exponential variation index for a positive continuous random variable X on [0,\infty) as

VI(X)=\frac{VarX}{(EX)^2}.

It can be viewed as the squared coefficient of variation. In reliability, it is used to discriminate between distributions with increasing and decreasing failure rate on average (IFRA/DFRA); see, e.g., Barlow and Proschan (1981), who work in terms of the coefficient of variation. See also Touré et al. (2020) for more details. Following RDI, the relative variation index (RVI) is defined, for two continuous random variables X and Y on the same support S = [0,\infty) with EX = EY, by

RVI_Y(X):=\frac{VarX}{VarY}=\frac{VI(X)}{VI(Y)} \gtreqqless 1;

i.e. X is over-, equi- and under-varied compared to Y if VarX > VarY, VarX = VarY and VarX < VarY, respectively. For instance, the inverse Gaussian variation index is defined as

RVI_{IG}(X)=\lambda^2\frac{VarX}{(EX)^3},

where \lambda > 0 is the shape parameter.
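As an illustration of the exponential variation index VI defined above, here is a minimal base R sketch using plug-in moment estimates. The helper name vi_hat is hypothetical and is not the package's vi.fun, whose arguments and output may differ; the gamma samples are chosen only because their population VI equals 1/shape.

```r
# Hypothetical helper (not the GWI API): plug-in estimate of VI(X) = Var X / (E X)^2.
vi_hat <- function(x) var(x) / mean(x)^2

set.seed(2)
n <- 5000
x_exp   <- rexp(n, rate = 0.5)                # exponential reference
x_under <- rgamma(n, shape = 4, rate = 2)     # population VI = 1 / 4
x_over  <- rgamma(n, shape = 0.5, rate = 2)   # population VI = 2

vi_hat(x_exp)    # expected to be near 1 (equi-variation)
vi_hat(x_under)  # expected below 1 (under-varied vs exponential)
vi_hat(x_over)   # expected above 1 (over-varied vs exponential)

# VI is the squared coefficient of variation.
all.equal(vi_hat(x_under), (sd(x_under) / mean(x_under))^2)
```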

Next, consider the following notation. Let Y = (Y_1,\ldots,Y_k)^{\top} be a nondegenerate count or continuous k-variate random vector, k\ge 1. Let also EY be the mean vector of Y and covY = (cov(Y_i,Y_j))_{i,j\in \{1,\ldots,k\}} the covariance matrix of Y.

The generalized dispersion index (GDI) and marginal dispersion index (MDI):

Kokonendji and Puig (2018) introduced the generalized dispersion index for a count vector Y on \{0,1,2,\ldots\}^k by

GDI(Y) =\frac{\sqrt{EY}^{\top} ( covY)\sqrt{EY}}{EY^{\top}EY}.

Note that when k=1, GDI(Y) is just the classical Fisher dispersion index DI. GDI(Y) compares the full variability of Y (in the numerator) with its expected uncorrelated Poissonian variability (in the denominator), which depends only on EY. GDI(Y) takes into account the correlation between variables. To take into account only the dispersion information coming from the margins, the authors defined the "marginal dispersion index":

MDI(Y) = \frac{\sqrt{EY}^{\top}( diag varY )\sqrt{EY}}{EY^{\top}EY}=\sum_{j=1}^k\frac{\{E(Y_j)\}^2}{EY^{\top}EY} DI(Y_j).
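The GDI and MDI formulas above translate directly into matrix operations. The sketch below uses base R with plug-in estimates (colMeans and cov); the helper names gdi_hat and mdi_hat are hypothetical and are not the package's gmdi.fun, and the common-shock Poisson construction is only an illustrative assumption used to obtain Poisson margins with positive correlation.

```r
# Hypothetical helpers (not the GWI API); Y is an n x k matrix of counts.
gdi_hat <- function(Y) {
  m <- colMeans(Y)   # estimate of EY
  S <- cov(Y)        # estimate of covY
  as.numeric(t(sqrt(m)) %*% S %*% sqrt(m)) / sum(m^2)
}
mdi_hat <- function(Y) {
  m <- colMeans(Y)
  v <- diag(cov(Y))  # marginal variances only
  sum(m * v) / sum(m^2)
}

set.seed(3)
n <- 5000
z0 <- rpois(n, 1)    # shared component inducing positive correlation
Y <- cbind(z0 + rpois(n, 2), z0 + rpois(n, 5), z0 + rpois(n, 3))

mdi_hat(Y)  # margins are Poisson, so expected near 1
gdi_hat(Y)  # expected above MDI because of the positive covariances

# MDI as the weighted sum of the marginal DI(Y_j), as in the formula above.
m <- colMeans(Y)
sum(m^2 / sum(m^2) * diag(cov(Y)) / m)

# For k = 1, GDI reduces to the univariate DI.
gdi_hat(Y[, 1, drop = FALSE])
```

Comparing the two values shows how much of the multivariate dispersion comes from the dependence structure rather than from the margins.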

The generalized variation index (GVI) and marginal variation index (MVI):

Similarly, Kokonendji et al. (2020) defined the generalized variation index for a positive continuous vector Y on [0, \infty)^k by

GVI(Y) =\frac{EY^{\top} ( covY) EY}{(EY^{\top}EY)^2}.

Note that when k=1, GVI(Y) is the univariate variation index VI. GVI(Y) compares the full variability of Y (in the numerator) with its expected uncorrelated exponential variability (in the denominator), which depends only on EY. GVI(Y) also takes into account the correlation between variables. To take into account only the variation information coming from the margins, Kokonendji et al. (2020) defined the "marginal variation index":

MVI(Y) = \frac{EY^{\top}( diag varY )EY}{(EY^{\top}EY)^2}=\sum_{j=1}^k\frac{(EY_j)^4}{(EY^{\top}EY)^2} VI(Y_j).
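The GVI and MVI formulas above can be sketched in the same way. The code below uses base R plug-in estimates; the helper names gvi_hat and mvi_hat are hypothetical and are not the package's gmvi.fun, and the additive gamma construction is only an illustrative assumption that yields a positive vector with positively correlated components.

```r
# Hypothetical helpers (not the GWI API); Y is an n x k matrix of positive values.
gvi_hat <- function(Y) {
  m <- colMeans(Y)   # estimate of EY
  S <- cov(Y)        # estimate of covY
  as.numeric(t(m) %*% S %*% m) / sum(m^2)^2
}
mvi_hat <- function(Y) {
  m <- colMeans(Y)
  v <- diag(cov(Y))  # marginal variances only
  sum(m^2 * v) / sum(m^2)^2
}

set.seed(4)
n <- 5000
g0 <- rexp(n, rate = 1)  # shared component inducing positive correlation
Y <- cbind(g0 + rgamma(n, shape = 2, rate = 1),
           g0 + rgamma(n, shape = 2, rate = 2),
           g0 + rgamma(n, shape = 2, rate = 0.5))

gvi_hat(Y)  # uses the full covariance matrix
mvi_hat(Y)  # uses the margins only; expected below GVI here

# MVI as the weighted sum of the marginal VI(Y_j), as in the formula above.
m <- colMeans(Y)
sum(m^4 / sum(m^2)^2 * diag(cov(Y)) / m^2)

# For k = 1, GVI reduces to the univariate VI.
gvi_hat(Y[, 1, drop = FALSE])
```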

Author(s)

Aboubacar Y. Touré and Célestin C. Kokonendji

Maintainer: Aboubacar Y. Touré <aboubacaryacoubatoure.ussgb@gmail.com>

References

Abid, R., Kokonendji, C.C. and Masmoudi, A. (2020). Geometric Tweedie regression models for continuous and semicontinuous data with variation phenomenon, AStA Advances in Statistical Analysis 104, 33-58.

Abid, R., Kokonendji, C.C. and Masmoudi, A. (2021). On Poisson-exponential-Tweedie models for ultra-overdispersed count data, AStA Advances in Statistical Analysis 105, 1-23.

Barlow, R.E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing: Probability Models, Silver Spring, Maryland.

Fisher, R.A. (1934). The effects of methods of ascertainment upon the estimation of frequencies, Annals of Eugenics 6, 13-25.

Kokonendji, C.C. (2014). Over- and underdispersion models. In: N. Balakrishnan (Ed.), The Wiley Encyclopedia of Clinical Trials - Methods and Applications of Statistics in Clinical Trials, Vol. 2 (Chap. 30), pp. 506-526. Wiley, New York.

Kokonendji, C.C. and Puig, P. (2018). Fisher dispersion index for multivariate count distributions: A review and a new proposal, Journal of Multivariate Analysis 165, 180-193.

Kokonendji, C.C., Touré, A.Y. and Sawadogo, A. (2020). Relative variation indexes for multivariate continuous distributions on [0,\infty)^k and extensions, AStA Advances in Statistical Analysis 104, 285-307.

Mizère, D., Kokonendji, C.C. and Dossou-Gbété, S. (2006). Quelques tests de la loi de Poisson contre des alternatives générales basées sur l'indice de dispersion de Fisher, Revue de Statistique Appliquée 54, 61-84.

Touré, A.Y., Dossou-Gbété, S. and Kokonendji, C.C. (2020). Asymptotic normality of the test statistics for relative dispersion and relative variation indexes, Journal of Applied Statistics 47, 2479-2491.

Weiss, C.H. (2018). An Introduction to Discrete-Valued Time Series. Wiley, Hoboken NJ.

[Package GWI version 1.0.2 Index]