percentdiff.evaluate.core {EvaluateCore}R Documentation

Percentage Difference of Means and Variances

Description

Compute the following differences between the entire collection (EC) and core set (CS).

Usage

percentdiff.evaluate.core(data, names, quantitative, selected, alpha = 0.05)

Arguments

data

The data as a data frame object. The data frame should possess one row per individual and columns with the individual names and multiple trait/character data.

names

Name of column with the individual names as a character string

quantitative

Name of columns with the quantitative traits as a character vector.

selected

Character vector with the names of individuals selected in core collection and present in the names column.

alpha

Type I error probability (Significance level) of difference.

Details

The differences are computed as follows.

\[MD\%_{Hu} = \left ( \frac{S_{t}}{n} \right ) \times 100\]

Where, \(S_{t}\) is the number of traits with a significant difference between the means of the EC and the CS and \(n\) is the total number of traits. A representative core should have \(MD\%_{Hu}\) < 20 % and \(CR\) > 80 % (Hu et al. 2000).

\[VD\%_{Hu} = \left ( \frac{S_{F}}{n} \right ) \times 100\]

Where, \(S_{F}\) is the number of traits with a significant difference between the variances of the EC and the CS and \(n\) is the total number of traits. Larger \(VD\%_{Hu}\) value indicates a more diverse core set.

\[MD\%_{Kim} = \left ( \frac{1}{n}\sum_{i=1}^{n} \frac{\left | M_{EC_{i}}-M_{CS_{i}} \right |}{M_{CS_{i}}} \right ) \times 100\]

Where, \(M_{EC_{i}}\) is the mean of the EC for the \(i\)th trait, \(M_{CS_{i}}\) is the mean of the CS for the \(i\)th trait and \(n\) is the total number of traits.

\[VD\%_{Kim} = \left ( \frac{1}{n}\sum_{i=1}^{n} \frac{\left | V_{EC_{i}}-V_{CS_{i}} \right |}{V_{CS_{i}}} \right ) \times 100\]

Where, \(V_{EC_{i}}\) is the variance of the EC for the \(i\)th trait, \(V_{CS_{i}}\) is the variance of the CS for the \(i\)th trait and \(n\) is the total number of traits.

\[\overline{d}D\% = \frac{\overline{d}_{CS}-\overline{d}_{EC}}{\overline{d}_{EC}} \times 100\]

Where, \(\overline{d}_{CS}\) is the mean squared Euclidean distance among accessions in the CS and \(\overline{d}_{EC}\) is the mean squared Euclidean distance among accessions in the EC.

Value

A data frame with the values of \(MD\%_{Hu}\), \(VD\%_{Hu}\), \(MD\%_{Kim}\), \(VD\%_{Kim}\) and \(\overline{d}D\%\).

References

Hu J, Zhu J, Xu HM (2000). “Methods of constructing core collections by stepwise clustering with three sampling strategies based on the genotypic values of crops.” Theoretical and Applied Genetics, 101(1), 264–268.

Kim K, Chung H, Cho G, Ma K, Chandrabalan D, Gwag J, Kim T, Cho E, Park Y (2007). “PowerCore: A program applying the advanced M strategy with a heuristic search for establishing core sets.” Bioinformatics, 23(16), 2155–2162.

Studnicki M, Madry W, Schmidt J (2013). “Comparing the efficiency of sampling strategies to establish a representative in the phenotypic-based genetic diversity core collection of orchardgrass (Dactylis glomerata L.).” Czech Journal of Genetics and Plant Breeding, 49(1), 36–47.

See Also

snk.evaluate.core, snk.evaluate.core

Examples


data("cassava_CC")
data("cassava_EC")

ec <- cbind(genotypes = rownames(cassava_EC), cassava_EC)
ec$genotypes <- as.character(ec$genotypes)
rownames(ec) <- NULL

core <- rownames(cassava_CC)

quant <- c("NMSR", "TTRN", "TFWSR", "TTRW", "TFWSS", "TTSW", "TTPW", "AVPW",
           "ARSR", "SRDM")
qual <- c("CUAL", "LNGS", "PTLC", "DSTA", "LFRT", "LBTEF", "CBTR", "NMLB",
          "ANGB", "CUAL9M", "LVC9M", "TNPR9M", "PL9M", "STRP", "STRC",
          "PSTR")

ec[, qual] <- lapply(ec[, qual],
                     function(x) factor(as.factor(x)))

percentdiff.evaluate.core(data = ec, names = "genotypes",
                          quantitative = quant, selected = core)


[Package EvaluateCore version 0.1.3 Index]