dfba_gamma {DFBA} | R Documentation |
Goodman-Kruskal Gamma
Description
Given bivariate data in the form of either a rank-ordered table or a matrix, returns the number of concordant and discordant changes between the variates, the Goodman-Kruskal gamma statistic, and a Bayesian analysis of the population concordance proportion parameter phi.
Usage
dfba_gamma(x, a0 = 1, b0 = 1, prob_interval = 0.95)
Arguments
x |
Cross-tabulated matrix or table where cell [I, J] represents the frequency of observations where the rank of measure 1 is I and the rank of measure 2 is J. |
a0 |
First shape parameter for the prior beta distribution (default is 1) |
b0 |
Second shape parameter for the prior beta distribution (default is 1) |
prob_interval |
Desired probability for interval estimates (default is 0.95) |
Details
For bivariate data where both measures are restricted to an ordinal scale,
such as when the two variates are ranked data over a limited set of integers,
an ordered contingency table is often a convenient data representation.
In such a case, the element in the [I, J] cell of the matrix is the
frequency of occasions where one variate has a rank value of I and the
corresponding rank for the other variate is J. This situation is a
special case of the more general case where there are two continuous
bivariate measures. For the special case of a rank-order matrix with
frequencies, there is a distribution-free concordance correlation that is in
common usage: Goodman and Kruskal's gamma, G
(Siegel & Castellan, 1988).
Chechile (2020) showed that Goodman and Kruskal's gamma is equivalent to the
more general tau_A nonparametric correlation coefficient.
Historically, gamma was considered a different metric from tau because
the version of tau typically in standard use was tau_B, which
is a flawed metric because it does not properly correct for ties. Note:
cor(..., method = "kendall") returns the tau_B correlation, which
is incorrect when there are ties. The correct tau_A is computed by the
dfba_bivariate_concordance() function.
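The effect of ties can be illustrated with a short base-R sketch. It expands a frequency matrix (the one used in the Examples section) into paired rank vectors, calls cor(..., method = "kendall"), and compares the result with gamma computed from concordant and discordant counts. The counts 6588 and 566 were worked out by hand from that matrix and are hard-coded here as an assumption; all variable names are illustrative and not part of DFBA.

```r
# Frequency matrix from the Examples section of this page
N <- matrix(c(38, 4, 5, 0, 6, 40, 1, 2, 4, 8, 20, 30),
            ncol = 4, byrow = TRUE)

# Expand the table into one (row rank, column rank) pair per observation
rank_1 <- rep(as.vector(row(N)), times = as.vector(N))
rank_2 <- rep(as.vector(col(N)), times = as.vector(N))

# Kendall correlation as returned by base R; tied pairs enter its denominator
tau_b <- cor(rank_1, rank_2, method = "kendall")

# Gamma from hand-computed concordant / discordant counts (assumed values)
n_c <- 6588
n_d <- 566
gamma_stat <- (n_c - n_d) / (n_c + n_d)

print(c(tau_b = tau_b, gamma = gamma_stat))
```

Because a table of this kind contains many tied ranks, the Kendall value from cor() should come out smaller than gamma for this matrix.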
The gamma statistic is equal to G = (n_c - n_d)/(n_c + n_d), where n_c is
the number of occasions when the variates change in a concordant way and n_d
is the number of occasions when the variates change in a discordant fashion.
The value of n_c for an order matrix is the sum of terms for each [I, J]
that are equal to n_{ij} N^{+}_{ij}, where n_{ij} is the frequency
for cell [I, J] and N^{+}_{ij} is the sum of the frequencies in the
matrix where the row value is greater than I and where the column value is
greater than J. The value n_d is the sum of terms for each [I, J] that
are n_{ij} N^{-}_{ij}, where N^{-}_{ij} is the sum of the frequencies
in the matrix where the row value is greater than I and the column value is
less than J. The n_c and n_d values computed in this fashion
are, respectively, equal to the n_c and n_d values found when the bivariate
measures are entered as paired vectors into the
dfba_bivariate_concordance() function.
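The counting rule described above can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: count_concordance() is a hypothetical helper written for this page, and the matrix is the one from the Examples section.

```r
# Frequency matrix from the Examples section of this page
N <- matrix(c(38, 4, 5, 0, 6, 40, 1, 2, 4, 8, 20, 30),
            ncol = 4, byrow = TRUE)

# Hypothetical helper (not part of DFBA): returns the concordant count,
# the discordant count, and gamma for a rank-ordered frequency matrix
count_concordance <- function(x) {
  row_idx <- seq_len(nrow(x))
  col_idx <- seq_len(ncol(x))
  n_c <- 0
  n_d <- 0
  for (i in row_idx) {
    for (j in col_idx) {
      rows_below <- row_idx[row_idx > i]
      cols_right <- col_idx[col_idx > j]
      cols_left  <- col_idx[col_idx < j]
      # Frequencies with a larger row rank and a larger column rank
      n_plus <- sum(x[rows_below, cols_right])
      # Frequencies with a larger row rank and a smaller column rank
      n_minus <- sum(x[rows_below, cols_left])
      n_c <- n_c + x[i, j] * n_plus
      n_d <- n_d + x[i, j] * n_minus
    }
  }
  list(nc = n_c, nd = n_d, gamma = (n_c - n_d) / (n_c + n_d))
}

counts <- count_concordance(N)
print(counts)
```

For this matrix the loop gives 6588 concordant and 566 discordant comparisons, so gamma is 6022/7154, roughly 0.842.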
As with the dfba_bivariate_concordance()
function, the Bayesian analysis focuses on the
population concordance proportion phi, and G = 2*phi - 1. The
likelihood function is proportional to phi^{n_c} (1 - phi)^{n_d}. The
prior distribution is a beta distribution, and the posterior distribution is the
conjugate beta where a = a0 + n_c and b = b0 + n_d.
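The conjugate update can be sketched with the base-R beta quantile function. The counts below are assumed values (hand-computed from the matrix in the Examples section), and whether dfba_gamma() derives its median and equal-tail limits in exactly this way is an assumption; the sketch only applies the standard definitions.

```r
# Assumed inputs: counts for the Examples matrix and the default uniform prior
n_c <- 6588
n_d <- 566
a0 <- 1
b0 <- 1
prob_interval <- 0.95

# Conjugate beta update for the concordance proportion phi
a_post <- a0 + n_c
b_post <- b0 + n_d

# Sample concordance proportion and the matching gamma value (G = 2*phi - 1)
sample_p <- n_c / (n_c + n_d)
gamma_from_p <- 2 * sample_p - 1

# Posterior median and equal-tail interval from beta quantiles
tail_prob <- (1 - prob_interval) / 2
post_median <- qbeta(0.5, a_post, b_post)
eti_lower <- qbeta(tail_prob, a_post, b_post)
eti_upper <- qbeta(1 - tail_prob, a_post, b_post)

print(c(a_post = a_post, b_post = b_post, post_median = post_median,
        eti_lower = eti_lower, eti_upper = eti_upper))
```

With counts this large the posterior is dominated by the data, so the choice of a0 and b0 has little influence on the interval.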
Value
A list containing the following components:
gamma |
Sample Goodman-Kruskal gamma statistic; equivalent to the sample rank correlation coefficient tau_A |
a0 |
First shape parameter for prior beta |
b0 |
Second shape parameter for prior beta |
sample_p |
Sample estimate for proportion concordance |
nc |
Number of concordant comparisons between the paired measures |
nd |
Number of discordant comparisons between the paired measures |
a_post |
First shape parameter for the posterior beta distribution for the phi parameter |
b_post |
Second shape parameter for the posterior beta distribution for the phi parameter |
post_median |
Median of the posterior distribution for the phi concordance parameter |
prob_interval |
The probability of the interval estimate for the phi parameter |
eti_lower |
Lower limit of the posterior equal-tail interval for the phi parameter where the width of the interval is specified by the prob_interval input |
eti_upper |
Upper limit of the posterior equal-tail interval for the phi parameter where the width of the interval is specified by the prob_interval input |
References
Chechile, R.A. (2020). Bayesian Statistics for Experimental Scientists: A General Introduction Using Distribution-Free Methods. Cambridge: MIT Press.
Siegel, S., & Castellan, N. J. (1988). Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.
See Also
dfba_bivariate_concordance
for a more extensive discussion about the tau_A
statistic and the flawed tau_B correlation
Examples
# Example with matrix input
N <- matrix(c(38, 4, 5, 0, 6, 40, 1, 2, 4, 8, 20, 30),
            ncol = 4,
            byrow = TRUE)
colnames(N) <- c('C1', 'C2', 'C3', 'C4')
rownames(N) <- c('R1', 'R2', 'R3')
dfba_gamma(N)
# Sample problem with table input
NTable <- as.table(N)
dfba_gamma(NTable)