diversity.evaluate.core {EvaluateCore}    R Documentation

Diversity Indices

Description

Compute diversity indices (Simpson's and related indices, the Shannon-Weaver index and the McIntosh index) and perform the corresponding statistical tests to compare the phenotypic diversity of qualitative traits between the entire collection (EC) and the core set (CS).

Usage

diversity.evaluate.core(data, names, qualitative, selected, base = 2, R = 1000)

Arguments

data

The data as a data frame object. The data frame should possess one row per individual and columns with the individual names and multiple trait/character data.

names

Name of the column with the individual names, as a character string.

qualitative

Names of the columns with the qualitative traits, as a character vector.

selected

Character vector with the names of the individuals selected in the core collection. These names must be present in the names column.

base

The logarithm base to be used for computation of Shannon-Weaver Diversity Index (\(I\)). Default is 2.

R

The number of bootstrap replicates. Default is 1000.
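As a rough illustration of the input layout described above, the following sketch builds a small data frame with one row per individual. The object names (toy, toy_qual, toy_core) and the trait values are hypothetical and not part of the package; converting the qualitative columns to factors follows the Examples section rather than a stated requirement.

```r
# Hypothetical toy data (illustrative only): one row per individual
toy <- data.frame(
  genotypes = paste0("G", 1:8),
  COLOUR = factor(c("red", "red", "white", "white",
                    "pink", "red", "white", "pink")),
  SHAPE = factor(c("round", "oval", "round", "round",
                   "oval", "oval", "round", "oval")),
  stringsAsFactors = FALSE
)

toy_qual <- c("COLOUR", "SHAPE")       # names of the qualitative trait columns
toy_core <- c("G1", "G3", "G5", "G8")  # individuals selected in the core set

# Checks mirroring the argument descriptions
stopifnot(
  is.character(toy$genotypes),
  all(toy_qual %in% names(toy)),
  all(vapply(toy[toy_qual], is.factor, logical(1))),
  all(toy_core %in% toy$genotypes)
)

# With EvaluateCore loaded, a call would then take this form:
# diversity.evaluate.core(data = toy, names = "genotypes",
#                         qualitative = toy_qual, selected = toy_core,
#                         base = 2, R = 1000)
```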

Value

A list with three data frames as follows.

simpson
Trait

The qualitative trait.

EC_No.Classes

The number of classes in the trait for EC.

CS_No.Classes

The number of classes in the trait for CS.

EC_d

The Simpson's Index (\(d\)) for EC.

EC_D

The Simpson's Index of Diversity (\(D\)) for EC.

EC_D.max

The Maximum Simpson's Index of Diversity (\(D_{max}\)) for EC.

EC_D.inv

The Simpson's Reciprocal Index (\(D_{R}\)) for EC.

EC_D.rel

The Relative Reciprocal Index (\(D'\)) for EC.

EC_d.V

The variance of \(d\) for EC according to (Simpson 1949).

EC_d.boot.V

The bootstrap variance of \(d\) for EC.

CS_d

The Simpson's Index (\(d\)) for CS.

CS_D

The Simpson's Index of Diversity (\(D\)) for CS.

CS_D.max

The Maximum Simpson's Index of Diversity (\(D_{max}\)) for CS.

CS_D.inv

The Simpson's Reciprocal Index (\(D_{R}\)) for CS.

CS_D.rel

The Relative Reciprocal Index (\(D'\)) for CS.

CS_d.V

The variance of \(d\) for CS according to (Simpson 1949).

CS_d.boot.V

The bootstrap variance of \(d\) for CS.

d.t.df

The degrees of freedom for t test.

d.t.stat

The t statistic.

d.t.pvalue

The p value for t test.

d.t.significance

The significance of the t test.

d.boot.z.df

The degrees of freedom for bootstrap z score.

d.boot.z.stat

The bootstrap z score.

d.boot.z.pvalue

The p value of z score.

d.boot.z.significance

The significance of z score.

shannon
Trait

The qualitative trait.

EC_No.Classes

The number of classes in the trait for EC.

CS_No.Classes

The number of classes in the trait for CS.

EC_I

The Shannon-Weaver Diversity Index (\(I\)) for EC.

EC_I.max

The Maximum Shannon-Weaver Diversity Index (\(I_{max}\)) for EC.

EC_I.rel

The Relative Shannon-Weaver Diversity Index (\(I'\)) for EC.

EC_I.V

The variance of \(I\) for EC according to (Hutcheson 1970).

EC_I.boot.V

The bootstrap variance of \(I\) for EC.

CS_I

The Shannon-Weaver Diversity Index (\(I\)) for CS.

CS_I.max

The Maximum Shannon-Weaver Diversity Index (\(I_{max}\)) for CS.

CS_I.rel

The Relative Shannon-Weaver Diversity Index (\(I'\)) for CS.

CS_I.V

The variance of \(I\) for CS according to (Hutcheson 1970).

CS_I.boot.V

The bootstrap variance of \(I\) for CS.

I.t.stat

The t statistic.

I.t.df

The degrees of freedom for t test.

I.t.pvalue

The p value for t test.

I.t.significance

The significance of the t test.

I.boot.z.df

The degrees of freedom for bootstrap z score.

I.boot.z.stat

The bootstrap z score.

I.boot.z.pvalue

The p value of z score.

I.boot.z.significance

The significance of z score.

mcintosh
EC_No.Classes

The number of classes in the trait for EC.

CS_No.Classes

The number of classes in the trait for CS.

EC_D.Mc

The McIntosh Index (\(D_{Mc}\)) for EC.

CS_D.Mc

The McIntosh Index (\(D_{Mc}\)) for CS.

M.boot.z.stat

The bootstrap z score.

M.boot.z.df

The degrees of freedom for bootstrap z score.

M.boot.z.pvalue

The p value of z score.

M.boot.z.significance

The significance of z score.

Details

The diversity indices and the corresponding statistical tests implemented in diversity.evaluate.core are as follows.

Simpson's and related indices

Simpson's index (\(d\)), which estimates the probability that two randomly selected accessions belong to the same phenotypic class of a trait, is computed as follows (Simpson 1949; Peet 1974).

\[d = \sum_{i = 1}^{k}p_{i}^{2}\]

Where, \(p_{i}\) denotes the proportion/fraction/frequency of accessions in the \(i\)th phenotypic class for a trait and \(k\) is the number of phenotypic classes for the trait.

The value of \(d\) can range from 0 to 1 with 0 representing maximum diversity and 1, no diversity.

\(d\) is subtracted from 1 to give Simpson's index of diversity (\(D\)) (Greenberg 1956; Berger and Parker 1970; Peet 1974; Hennink and Zeven 1990), originally suggested by Gini (1912, 1912) and described in the literature as Gini's diversity index or the Gini-Simpson index. It is the same as Nei's diversity index or Nei's variation index (Nei 1973; Hennink and Zeven 1990). \(D\) ranges from 0 to 1; the greater the value of \(D\), the greater the diversity.

\[D = 1 - d\]

The maximum value of \(D\), \(D_{max}\) occurs when accessions are uniformly distributed across the phenotypic classes and is computed as follows (Hennink and Zeven 1990).

\[D_{max} = 1 - \frac{1}{k}\]

The reciprocal of \(d\) gives Simpson's reciprocal index (\(D_{R}\)) (Williams 1964; Hennink and Zeven 1990), which can range from 1 to \(k\). This was also described in Hill (1973) as \(N_{2}\).

\[D_{R} = \frac{1}{d}\]

The relative Simpson's index of diversity or relative Nei's diversity/variation index (\(D'\)) (Hennink and Zeven 1990) is defined as follows (Peet 1974).

\[D' = \frac{D}{D_{max}}\]
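The formulas above can be traced by hand with a minimal base R sketch. The trait vector is hypothetical illustrative data and the variable names are chosen here for readability; this is not the package's internal code.

```r
# Hypothetical class labels for one qualitative trait (illustrative only)
trait <- factor(c("A", "A", "A", "A", "B", "B", "B", "C", "C", "D"))

p <- as.numeric(table(trait)) / length(trait)  # class proportions p_i
k <- length(p)                                 # number of phenotypic classes

d     <- sum(p^2)    # Simpson's index
D     <- 1 - d       # Simpson's index of diversity
D_max <- 1 - 1 / k   # maximum of D for k classes
D_R   <- 1 / d       # Simpson's reciprocal index
D_rel <- D / D_max   # relative index D'

round(c(d = d, D = D, D_max = D_max, D_R = D_R, D_rel = D_rel), 4)
```

With proportions 0.4, 0.3, 0.2 and 0.1, \(d\) is 0.3, so \(D\) is 0.7 against a maximum of 0.75 for four classes.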

Differences in Simpson's diversity index for qualitative traits of EC and CS can be tested by a t-test using the associated variance estimate described in Simpson (1949) (Lyons and Hutcheson 1978).

The t statistic is computed as follows.

\[t = \frac{d_{EC} - d_{CS}}{\sqrt{V_{d_{EC}} + V_{d_{CS}}}}\]

Where, the variance of \(d\) (\(V_{d}\)) is,

\[V_{d} = \frac{4N(N-1)(N-2)\sum_{i=1}^{k}(p_{i})^{3} + 2N(N-1)\sum_{i=1}^{k}(p_{i})^{2} - 2N(N-1)(2N-3) \left( \sum_{i=1}^{k}(p_{i})^{2} \right)^{2}}{[N(N-1)]^{2}}\]

The associated degrees of freedom is computed as follows.

\[df = (k_{EC} - 1) + (k_{CS} - 1)\]

Where, \(k_{EC}\) and \(k_{CS}\) are the number of phenotypic classes in the trait for EC and CS respectively.
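The t-test described above can be sketched in base R as follows. The class counts and the helper name simpson_var are hypothetical, and the two-sided p value is an assumption of this sketch, since the text does not state which alternative is used.

```r
# Variance of Simpson's d from class counts n, following the formula above
simpson_var <- function(n) {
  N <- sum(n)
  p <- n / N
  num <- 4 * N * (N - 1) * (N - 2) * sum(p^3) +
    2 * N * (N - 1) * sum(p^2) -
    2 * N * (N - 1) * (2 * N - 3) * sum(p^2)^2
  num / (N * (N - 1))^2
}

# Hypothetical class counts for one trait (illustrative only)
n_ec <- c(40, 30, 20, 10)  # entire collection
n_cs <- c(6, 5, 5, 4)      # core set

d_ec <- sum((n_ec / sum(n_ec))^2)
d_cs <- sum((n_cs / sum(n_cs))^2)

t_stat <- (d_ec - d_cs) / sqrt(simpson_var(n_ec) + simpson_var(n_cs))
df <- (length(n_ec) - 1) + (length(n_cs) - 1)

# Two-sided p value (assumption of this sketch)
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

c(t = t_stat, df = df, p = p_value)
```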

Shannon-Weaver and related indices

An index of information \(H\), was described by Shannon and Weaver (1949) as follows.

\[H = -\sum_{i=1}^{k}p_{i} \log_{2}(p_{i})\]

\(H\) is described in the literature as the Shannon, Shannon-Weaver or Shannon-Wiener diversity index. It is the quantity denoted \(I\) in the Arguments and Value sections.

Alternatively, \(H\) can be computed using the natural logarithm instead of the logarithm to base 2.

\[H = -\sum_{i=1}^{k}p_{i} \ln(p_{i})\]

The maximum value of \(H\) (\(H_{max}\)) is \(\log_{2}(k)\) or \(\ln(k)\), depending on the logarithm base used. This value occurs when each phenotypic class for a trait has the same proportion of accessions.

\[H_{max} = \log_{2}(k)\;\; \textrm{OR} \;\; H_{max} = \ln(k)\]

The relative Shannon-Weaver diversity index or Shannon equitability index (\(H'\)) is the Shannon diversity index (\(H\)) divided by the maximum diversity (\(H_{max}\)).

\[H' = \frac{H}{H_{max}}\]
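A minimal base R sketch of these three quantities is given below. The helper name shannon_index and the proportions are hypothetical; zero proportions are dropped on the usual convention that \(0 \log 0 = 0\), which is an assumption of this sketch.

```r
# Shannon-Weaver index from class proportions p, for a chosen logarithm base
shannon_index <- function(p, base = 2) {
  p <- p[p > 0]  # convention: empty classes contribute nothing
  -sum(p * log(p, base = base))
}

# Hypothetical class proportions for one trait (illustrative only)
p <- c(0.4, 0.3, 0.2, 0.1)
k <- length(p)

H     <- shannon_index(p, base = 2)
H_max <- log(k, base = 2)  # reached when all classes are equally frequent
H_rel <- H / H_max         # relative index H'

c(H = H, H_max = H_max, H_rel = H_rel)
```

Changing the base only rescales \(H\) and \(H_{max}\) by the same factor, so \(H'\) does not depend on the base.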

Differences in Shannon-Weaver diversity index for qualitative traits of EC and CS can be tested by Hutcheson t-test (Hutcheson 1970).

The Hutcheson t statistic is computed as follows.

\[t = \frac{H_{EC} - H_{CS}}{\sqrt{V_{H_{EC}} + V_{H_{CS}}}}\]

Where, the variance of \(H\) (\(V_{H}\)) is,

\[V_{H} = \frac{\sum_{i=1}^{k}n_{i}(\log_{2}{n_{i}})^{2} - \frac{(\sum_{i=1}^{k}n_{i}\log_{2}{n_{i}})^2}{N}}{N^{2}}\] \[\textrm{OR}\] \[V_{H} = \frac{\sum_{i=1}^{k}n_{i}(\ln{n_{i}})^{2} - \frac{(\sum_{i=1}^{k}n_{i}\ln{n_{i}})^2}{N}}{N^{2}}\]

The associated degrees of freedom is approximated as follows.

\[df = \frac{(V_{H_{EC}} + V_{H_{CS}})^{2}}{\frac{V_{H_{EC}}^{2}}{N_{EC}} + \frac{V_{H_{CS}}^{2}}{N_{CS}}}\]
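The Hutcheson t-test can be sketched in base R as below. The sketch assumes the usual first-order form of the variance, \(V_{H} = [\sum n_{i}(\ln n_{i})^{2} - (\sum n_{i}\ln n_{i})^{2}/N]/N^{2}\), where \(n_{i}\) are the class counts and \(N\) their total. The helper name shannon_counts, the counts and the two-sided p value are assumptions made for illustration.

```r
# Shannon index and its approximate variance from class counts n
shannon_counts <- function(n, base = exp(1)) {
  n <- n[n > 0]
  N <- sum(n)
  p <- n / N
  H <- -sum(p * log(p, base))
  V <- (sum(n * log(n, base)^2) - sum(n * log(n, base))^2 / N) / N^2
  list(H = H, V = V, N = N)
}

# Hypothetical class counts for one trait (illustrative only)
ec <- shannon_counts(c(40, 30, 20, 10))  # entire collection
cs <- shannon_counts(c(6, 5, 5, 4))      # core set

t_stat <- (ec$H - cs$H) / sqrt(ec$V + cs$V)
df <- (ec$V + cs$V)^2 / (ec$V^2 / ec$N + cs$V^2 / cs$N)

# Two-sided p value (assumption of this sketch)
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

c(t = t_stat, df = df, p = p_value)
```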

McIntosh Diversity Index

A similar index of diversity (\(D_{Mc}\)) was described by McIntosh (1967) as follows (Peet 1974).

\[D_{Mc} = \frac{N - \sqrt{\sum_{i=1}^{k}n_{i}^2}}{N - \sqrt{N}}\]

Where, \(n_{i}\) denotes the number of accessions in the \(i\)th phenotypic class for a trait and \(N\) is the total number of accessions so that \(p_{i} = {n_{i}}/{N}\).
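The McIntosh index can be computed directly from class counts, as in the following minimal sketch. The helper name mcintosh_index and the counts are hypothetical.

```r
# McIntosh index from class counts n
mcintosh_index <- function(n) {
  N <- sum(n)
  (N - sqrt(sum(n^2))) / (N - sqrt(N))
}

# Hypothetical class counts for one trait (illustrative only)
mcintosh_index(c(40, 30, 20, 10))

# Boundary cases: a single class gives 0, all singleton classes give 1
mcintosh_index(100)
mcintosh_index(rep(1, 25))
```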

Testing for difference with bootstrapping

Bootstrap statistics are employed to test the difference between the Simpson, Shannon-Weaver and McIntosh indices for qualitative traits of EC and CS (Solow 1993).

If \(I_{EC}\) and \(I_{CS}\) are the diversity indices with the original number of accessions, then random samples of the same size as the original are repeatedly generated (with replacement) \(R\) times and the corresponding diversity index is computed for each sample.

\[I_{EC}^{*} = \lbrace I_{EC_{1}}, I_{EC_{2}}, \cdots, I_{EC_{R}} \rbrace\] \[I_{CS}^{*} = \lbrace I_{CS_{1}}, I_{CS_{2}}, \cdots, I_{CS_{R}} \rbrace\]

Then the bootstrap null sample \(I_{0}\) is computed as follows.

\[\Delta^{*} = I_{EC}^{*} - I_{CS}^{*}\] \[I_{0} = \Delta^{*} - \overline{\Delta^{*}}\]

Where, \(\overline{\Delta^{*}}\) is the mean of \(\Delta^{*}\).

Now the original difference in diversity indices (\(\Delta_{0} = I_{EC} - I_{CS}\)) is tested against the mean of the bootstrap null sample (\(I_{0}\)) by a z test. The z score test statistic is computed as follows.

\[z = \frac{\Delta_{0} - \overline{I_{0}}}{\sqrt{V_{I_{0}}}}\]

Where, \(\overline{I_{0}}\) and \(V_{I_{0}}\) are the mean and variance of the bootstrap null sample \(I_{0}\).

The corresponding degrees of freedom is estimated as follows.

\[df = (k_{EC} - 1) + (k_{CS} - 1)\]
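The bootstrap procedure above can be sketched in base R, here with Simpson's \(d\) as the diversity index. The simulated trait data, the helper name simpson_d and the use of a two-sided normal reference distribution for the p value are assumptions of this sketch; how the package combines the z score and the degrees of freedom is not described here.

```r
set.seed(1)

# Simpson's d for a vector of class labels
simpson_d <- function(x) {
  p <- table(x) / length(x)
  sum(p^2)
}

# Hypothetical trait data (illustrative only)
ec_trait <- sample(c("A", "B", "C", "D"), 200, replace = TRUE,
                   prob = c(0.4, 0.3, 0.2, 0.1))  # entire collection
cs_trait <- sample(ec_trait, 30)                  # core set drawn from EC

R <- 1000  # number of bootstrap replicates

# Resample each set with replacement at its original size
I_ec_star <- replicate(R, simpson_d(sample(ec_trait, replace = TRUE)))
I_cs_star <- replicate(R, simpson_d(sample(cs_trait, replace = TRUE)))

delta_star <- I_ec_star - I_cs_star
I_0 <- delta_star - mean(delta_star)  # bootstrap null sample, centred at 0

delta_0 <- simpson_d(ec_trait) - simpson_d(cs_trait)
z <- (delta_0 - mean(I_0)) / sqrt(var(I_0))

# Two-sided p value from the standard normal (assumption of this sketch)
p_value <- 2 * pnorm(abs(z), lower.tail = FALSE)

c(z = z, p = p_value)
```

Because the null sample is centred, its mean is zero up to rounding, so the statistic reduces to the observed difference divided by the bootstrap standard deviation of the difference.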

References

Berger WH, Parker FL (1970). “Diversity of planktonic foraminifera in deep-sea sediments.” Science, 168(3937), 1345–1347.

Gini C (1912). Variabilita e Mutabilita. Contributo allo Studio delle Distribuzioni e delle Relazioni Statistiche. [Fasc. I.]. Tipogr. di P. Cuppini, Bologna.

Gini C (1912). “Variabilita e mutabilita.” In Pizetti E, Salvemini T (eds.), Memorie di Metodologica Statistica. Libreria Eredi Virgilio Veschi, Roma, Italy.

Greenberg JH (1956). “The measurement of linguistic diversity.” Language, 32(1), 109.

Hennink S, Zeven AC (1990). “The interpretation of Nei and Shannon-Weaver within population variation indices.” Euphytica, 51(3), 235–240.

Hill MO (1973). “Diversity and evenness: A unifying notation and its consequences.” Ecology, 54(2), 427–432.

Hutcheson K (1970). “A test for comparing diversities based on the Shannon formula.” Journal of Theoretical Biology, 29(1), 151–154.

Lyons NI, Hutcheson K (1978). “C20. Comparing diversities: Gini's index.” Journal of Statistical Computation and Simulation, 8(1), 75–78.

McIntosh RP (1967). “An index of diversity and the relation of certain concepts to diversity.” Ecology, 48(3), 392–404.

Nei M (1973). “Analysis of gene diversity in subdivided populations.” Proceedings of the National Academy of Sciences, 70(12), 3321–3323.

Peet RK (1974). “The measurement of species diversity.” Annual Review of Ecology and Systematics, 5(1), 285–307.

Shannon CE, Weaver W (1949). The Mathematical Theory of Communication. University of Illinois Press.

Simpson EH (1949). “Measurement of diversity.” Nature, 163(4148), 688–688.

Solow AR (1993). “A simple test for change in community structure.” The Journal of Animal Ecology, 62(1), 191.

Williams CB (1964). Patterns in the Balance of Nature and Related Problems in Quantitative Ecology. Academic Press.

See Also

shannon, diversity, boot

Examples


data("cassava_CC")
data("cassava_EC")

ec <- cbind(genotypes = rownames(cassava_EC), cassava_EC)
ec$genotypes <- as.character(ec$genotypes)
rownames(ec) <- NULL

core <- rownames(cassava_CC)

quant <- c("NMSR", "TTRN", "TFWSR", "TTRW", "TFWSS", "TTSW", "TTPW", "AVPW",
           "ARSR", "SRDM")
qual <- c("CUAL", "LNGS", "PTLC", "DSTA", "LFRT", "LBTEF", "CBTR", "NMLB",
          "ANGB", "CUAL9M", "LVC9M", "TNPR9M", "PL9M", "STRP", "STRC",
          "PSTR")

# Convert the qualitative trait columns to factors
ec[, qual] <- lapply(ec[, qual], factor)


diversity.evaluate.core(data = ec, names = "genotypes",
                        qualitative = qual, selected = core)



[Package EvaluateCore version 0.1.3 Index]