chi2x3way {chi2x3way}    R Documentation

Chi-square and Marcotorchino's index for three-way contingency tables

Description

The function performs:
1) the computation of Pearson's index and its partition for three-way contingency tables under two scenarios. When scen = 1, the theoretical probabilities are prescribed by the analyst (by default they are homogeneous). When scen = 2, the theoretical probabilities are estimated from the data.
2) the computation of Marcotorchino's index and its partition for three-way contingency tables under the same two scenarios. To check the distribution of Marcotorchino's index under either scenario, set the input parameter simulation = TRUE to obtain the results of a simulation study.

Usage

chi2x3way(X,  indextype = "chi2", scen = 2, simulation = FALSE,
nboots = 1000, nran = 1000,
pi = rep(1/dim(X)[[1]],dim(X)[[1]]),
pj = rep(1/dim(X)[[2]],dim(X)[[2]]),pk = rep(1/dim(X)[[3]],dim(X)[[3]]), digits = 3)

Arguments

X

The three-way contingency table.

indextype

The input parameter for specifying what index should be considered. By default, the partition of the classical three-way Pearson index indextype = "chi2" is selected. The analyst can also partition Marcotorchino's index by defining the input parameter indextype = "tauM".
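As a rough illustration of the quantity behind indextype = "tauM", the sketch below computes Marcotorchino's index in base R for a small made-up table. It assumes the usual definition from the nonsymmetric correspondence analysis literature, with the row variable as the response and the observed margins as weights; the data are hypothetical and this is not the package's internal code.

```r
# Hypothetical 2 x 3 x 2 table of counts (illustrative data only).
X <- array(c(12, 18, 25, 9, 14, 22, 30, 11, 16, 8, 20, 15), dim = c(2, 3, 2))
P <- X / sum(X)                      # observed cell proportions

# Observed row, column and tube margins.
p_row  <- apply(P, 1, sum)
p_col  <- apply(P, 2, sum)
p_tube <- apply(P, 3, sum)

# Assumed definition (rows as response):
# tau_M = sum_ijk p.j. p..k (p_ijk / (p.j. p..k) - p_i..)^2 / (1 - sum_i p_i..^2)
W <- outer(p_col, p_tube)            # J x K matrix of products p.j. * p..k
num <- 0
for (i in seq_along(p_row)) {
  num <- num + sum(W * (P[i, , ] / W - p_row[i])^2)
}
tauM <- num / (1 - sum(p_row^2))
tauM
```

The index is zero when the conditional row distribution given each column-tube pair equals the row margin, which is the sense in which it measures predictability of the rows.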

scen

The input parameter for specifying what scenario should be considered. When scen = 1, the probabilities are fixed by the analyst and, unless specified otherwise, homogeneous among the categories (i.e. Scenario 1). When scen = 2, the expected frequencies are set equal to the observed marginal frequencies (i.e. Scenario 2). The function signature in Usage sets scen = 2 as the default.
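The difference between the two scenarios can be sketched in base R as follows. The table, the helper pearson3 and all variable names are hypothetical; the sketch only evaluates the total Pearson statistic against the complete-independence model and does not reproduce the partition computed by the package.

```r
# Hypothetical 2 x 3 x 2 table of counts (illustrative data only).
X <- array(c(12, 18, 25, 9, 14, 22, 30, 11, 16, 8, 20, 15), dim = c(2, 3, 2))
n <- sum(X)
P <- X / n                                   # observed cell proportions

# Scenario 1: probabilities prescribed by the analyst (uniform here).
pi1 <- rep(1 / dim(X)[1], dim(X)[1])
pj1 <- rep(1 / dim(X)[2], dim(X)[2])
pk1 <- rep(1 / dim(X)[3], dim(X)[3])

# Scenario 2: probabilities estimated by the observed margins.
pi2 <- apply(P, 1, sum)
pj2 <- apply(P, 2, sum)
pk2 <- apply(P, 3, sum)

# Pearson's statistic against the model p_i * p_j * p_k.
pearson3 <- function(P, n, pi, pj, pk) {
  E <- outer(outer(pi, pj), pk)              # I x J x K expected proportions
  n * sum((P - E)^2 / E)
}

chi2_scen1 <- pearson3(P, n, pi1, pj1, pk1)
chi2_scen2 <- pearson3(P, n, pi2, pj2, pk2)
c(scen1 = chi2_scen1, scen2 = chi2_scen2)
```

Under Scenario 1 the statistic also reflects departures of the observed margins from the prescribed probabilities, whereas under Scenario 2 it reflects only the association among the three variables.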

simulation

A logical flag specifying whether a simulation study is carried out as part of the analysis. When simulation = TRUE, three-way contingency tables are randomly generated under the scenario specified by scen. Note that, for investigating the index distributions, any randomly generated contingency table with at least one cell frequency less than five is automatically discarded. When simulation = TRUE, the distributions of the terms from the partition of the classic $C_M$-statistic (associated with Marcotorchino's index), of the revised $C^S_M$-statistic and of Pearson's chi-squared index are graphically depicted and compared using QQ-plots. By default, simulation = FALSE.
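The discard rule described above can be sketched in base R. This is only an illustration under stated assumptions: tables are drawn from a multinomial distribution with uniform cell probabilities (Scenario 1), and the table dimensions, the sample size n and the number of draws are hypothetical values. The package's own generator may differ.

```r
set.seed(1)                                   # for reproducibility of the sketch only
dims   <- c(2, 2, 2)                          # hypothetical I x J x K
nboots <- 50                                  # hypothetical number of tables
n      <- 200                                 # hypothetical sample size per table

# Uniform cell probabilities under complete independence.
p_cell <- outer(outer(rep(1 / 2, 2), rep(1 / 2, 2)), rep(1 / 2, 2))

# Each column of `draws` holds the cell counts of one simulated table.
draws  <- rmultinom(nboots, size = n, prob = as.vector(p_cell))
tables <- lapply(seq_len(nboots), function(b) array(draws[, b], dim = dims))

# Discard any table with at least one cell frequency below five.
keep        <- vapply(tables, function(tab) all(tab >= 5), logical(1))
tables_kept <- tables[keep]
length(tables_kept)
```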

nboots

The input parameter for specifying the number of random three-way contingency tables to be generated when simulation = TRUE. By default, nboots = 1000.

nran

The input parameter for specifying the total number of samples of each randomly generated contingency table when simulation = TRUE. By default, nran = 1000.

pi

The input parameter pi specifies the probabilities assigned to the row categories. When scen = 1, they can be arbitrarily defined by the analyst. By default, the parameter is set to reflect homogeneous marginal (uniform) probabilities so that pi = rep(1/dim(X)[[1]], dim(X)[[1]]). When scen = 2 the hypothesized probabilities cannot be prescribed by the analyst and are set equal to the observed row margins of the three-way table.

pj

The input parameter pj specifies the probabilities assigned to the column categories. When scen = 1, they can be arbitrarily defined by the analyst. By default, the parameter is set to reflect homogeneous marginal (uniform) probabilities so that pj = rep(1/dim(X)[[2]], dim(X)[[2]]). When scen = 2 the hypothesized probabilities cannot be prescribed by the analyst and are set equal to the observed column margins of the three-way table.

pk

The input parameter pk specifies the probabilities assigned to the tube categories. When scen = 1, they can be arbitrarily defined by the analyst. By default, the parameter is set to reflect homogeneous marginal (uniform) probabilities so that pk = rep(1/dim(X)[[3]], dim(X)[[3]]). When scen = 2 the hypothesized probabilities cannot be prescribed by the analyst and are set equal to the observed tube margins of the three-way table.
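When prescribing pi, pj and pk for scen = 1, it may help to check them before the call. The sketch below uses a hypothetical helper, check_probs, which is not part of the package, and an illustrative table. It verifies that each vector has one strictly positive entry per category and sums to one, and then computes the observed margins that Scenario 2 would use instead.

```r
# Hypothetical helper: validate a prescribed probability vector for one way of X.
check_probs <- function(p, X, way) {
  stopifnot(length(p) == dim(X)[way],        # one probability per category
            all(p > 0),                      # strictly positive
            isTRUE(all.equal(sum(p), 1)))    # sums to one
  invisible(p)
}

X  <- array(1:24, dim = c(2, 3, 4))          # illustrative counts only
pi <- c(0.4, 0.6)                            # non-uniform row probabilities
pj <- rep(1 / 3, 3)                          # uniform column probabilities
pk <- c(0.1, 0.2, 0.3, 0.4)                  # non-uniform tube probabilities

check_probs(pi, X, 1)
check_probs(pj, X, 2)
check_probs(pk, X, 3)

# Observed margins, i.e. the probabilities used under Scenario 2.
margI <- apply(X, 1, sum) / sum(X)
margJ <- apply(X, 2, sum) / sum(X)
margK <- apply(X, 3, sum) / sum(X)
```

Note that naming a variable pi masks R's built-in constant of the same name in the current environment; the name is kept here only to match the function's argument.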

digits

The minimum number of decimal places used for displaying the numerical summaries of the analysis is set by the parameter digits.

By default, digits = 3.

Value

X

The three-way contingency table of dimension IxJxK.

indexparts

The three-way index partition indexparts. When indextype = "chi2", this output gives the chi-squared partition, while indextype = "tauM" returns the partition of Marcotorchino's index, $\tau_M$, and its related $C_M$-statistics. Further, it also returns the percentage of explained inertia, the degrees of freedom and the p-value of each term of the partition.

simulaout

When the input parameter simulation = TRUE, the output includes the object simulaout, which returns the nboots randomly generated three-way contingency tables. The output also includes the row, column and tube hypothesized probabilities pi, pj and pk, and their observed marginal frequencies, stored in the objects margI, margJ and margK, respectively. Furthermore, the output includes the empirical distribution of each term of the partition of the $C_M$-statistic, chi-squared statistic and $C^S_M$-statistic based on the nboots randomly generated contingency tables.

When simulation = FALSE, then simulaout = NULL.

Note

This function internally calls several other functions, depending on the setting of the input parameter indextype. It calls one of four functions, each of which performs a partition under one of the two scenarios. The two scenarios differ in how the theoretical probabilities are obtained: 1) when scen = 1, the theoretical probabilities are prescribed by the analyst and, unless specified otherwise, are all set equal (homogeneous margins); 2) when scen = 2, the theoretical probabilities are estimated from the data. After performing the partition, the function returns the output object needed for printing the results. The print function is print.Chi2for3way. This function belongs to the class chi3class.

Author(s)

Lombardo R, Takane Y and Beh EJ

References

Beh EJ and Lombardo R (2014) Correspondence Analysis: Theory, Practice and New Strategies. John Wiley & Sons.
Carlier A and Kroonenberg PM (1996) Biplots and decompositions in two-way and three-way correspondence analysis. Psychometrika, 61, 355-373.
Lancaster HO (1951) Complex contingency tables treated by the partition of the chi-square. Journal of the Royal Statistical Society, Series B, 13, 242-249.
Loisel S and Takane Y (2016) Partitions of Pearson's chi-square statistic for frequency tables: A comprehensive account. Computational Statistics, 31, 1429-1452.
Lombardo R, Carlier A and D'Ambra L (1996) Nonsymmetric correspondence analysis for three-way contingency tables. Methodologica, 4, 59-80.

Examples

## Partition of Marcotorchino's index for the olive data under Scenario 2
data(olive)
chi2x3way(olive, scen = 2, indextype = "tauM", simulation = FALSE, nboots = 100, nran = 1000,
pi = rep(1/dim(olive)[[1]],dim(olive)[[1]]), pj = rep(1/dim(olive)[[2]],dim(olive)[[2]]),
pk = rep(1/dim(olive)[[3]],dim(olive)[[3]]), digits = 3)

[Package chi2x3way version 1.1 Index]