synthetic.cross {ASRgenomics}    R Documentation

Generates a molecular matrix M for hypothetical crosses based on the genomic information of the parents

Description

This function generates (or imputes) a molecular matrix for offspring from hypothetical crosses based on the genomic information from the parents. This is a common procedure in species such as maize, where only the parents (inbred lines) are genotyped, and this information is used to generate/impute the genotypic data of each of the hybrid offspring. This function can also be used for bulked DNA analyses, in order to obtain a bulked molecular matrix for full-sib individuals where only the parents are genotyped.

Usage

synthetic.cross(
  M = NULL,
  ped = NULL,
  indiv = NULL,
  mother = NULL,
  father = NULL,
  heterozygote.action = c("useNA", "exact", "fail", "expected"),
  na.action = c("useNA", "expected"),
  message = TRUE
)

Arguments

M

A matrix with marker data of full form (n x p), with n individuals (mothers and fathers) and p markers. Individual and marker names are assigned to rownames and colnames, respectively. Data in the matrix is coded as 0, 1, 2 (integer or numeric) (default = NULL).

ped

A data frame with three columns containing only the pedigree of the hypothetical offspring (not the pedigree of the parents). It should include the three columns for individual, mother and father (default = NULL).

indiv

A character indicating the column in ped data frame containing the identification of the offspring (default = NULL).

mother

A character indicating the column in ped data frame containing the identification of the mother (default = NULL).

father

A character indicating the column in ped data frame containing the identification of the father (default = NULL).

heterozygote.action

Indicates the action to take when heterozygotes are found in a marker. Options are: "useNA", "exact", "fail", and "expected". See details for more information (default = "useNA").

na.action

Indicates the action to take when missing values are found in a marker. Options are: "useNA" and "expected". See details for more information (default = "useNA").

message

If TRUE, diagnostic messages are printed on screen (default = TRUE).

Details

For double haploids, almost all of the markers (except for genotyping errors) will be homozygous reads. But in many other cases (including recombinant inbred lines) there will be a proportion of heterozygous reads. In these cases, it is very difficult to infer (impute) the exact genotype of a given offspring individual. For example, if the parents are 0 (AA) and 1 (AC), then the offspring will differ because of Mendelian sampling. However, different strategies exist to determine the expected value for that specific cross (if required), which are detailed below using the option heterozygote.action.

Missing values require special treatment, and an imputation strategy is detailed below, as indicated by the option na.action.
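
As an illustration of the expected value of a cross, the sketch below computes the Mendelian expectation of the offspring code from two parental codes. The helper expected_offspring is hypothetical and is not part of ASRgenomics; it only assumes the 0, 1, 2 coding described for M and that each parent transmits one allele at random.

```r
# Hypothetical helper (not an ASRgenomics function): expected 0/1/2 code of an
# offspring given the codes of its two parents.
expected_offspring <- function(mother, father) {
  # A parent coded g transmits the counted allele with probability g / 2,
  # so the expected offspring code is the mean of the two parental codes.
  (mother + father) / 2
}

expected_offspring(0, 1)  # 0 x 1 cross: offspring are 0 or 1, each with probability 1/2
expected_offspring(1, 1)  # 1 x 1 cross: offspring are 0, 1, 2 with probabilities 1/4, 1/2, 1/4
expected_offspring(0, 2)  # 0 x 2 cross: every offspring is 1
```

For the 0 (AA) by 1 (AC) cross mentioned above, the expectation is 0.5, a value that no single offspring can take but that describes the average of many full-sibs.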

Similarly, the calculation of the expected read of a cross when both parents are missing is also based on population allelic frequencies for the given marker. Here p is the frequency of the allele counted by the 0, 1, 2 coding and q = 1 - p, so that a parent is 0, 1 or 2 with probability q^2, 2pq or p^2, respectively. The expressions for expected values are detailed below.

            q^2 x q^2 (probability that both parents are 0) x 0 (expected value of the offspring from a 0 x 0 cross: 0(1/1)) +

            2 x (q^2 x 2pq) (probability that the first parent is 0 and the second is 1; this requires the multiplication by 2 because it is also possible that the first parent is 1 and the second is 0) x 0.5 (offspring: 0(1/2) + 1(1/2)) +

            2 x (q^2 x p^2) (this could be 0 x 2 or 2 x 0) x 1 (offspring: 1(1/1)) +

            2pq x 2pq (both parents are 1) x 1 (offspring: 0(1/4) + 1(1/2) + 2(1/4)) +

            2 x (2pq x p^2) (this could be 1 x 2 or 2 x 1) x 1.5 (offspring: 1(1/2) + 2(1/2)) +

            p^2 x p^2 (both parents are 2) x 2 (offspring: 2(1/1))
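
The six terms above can be checked numerically. The sketch below assumes that p is the frequency of the allele counted by the 0, 1, 2 coding, that q = 1 - p, and that parental genotypes follow Hardy-Weinberg proportions. The helper expected_both_missing is hypothetical and is not taken from the package source. Under these assumptions the sum simplifies to 2p.

```r
# Hypothetical helper (not an ASRgenomics function): expected offspring code
# when both parents are missing, for a marker with allele frequency p.
expected_both_missing <- function(p) {
  q <- 1 - p
  geno <- c(0, 1, 2)
  prob <- c(q^2, 2 * p * q, p^2)  # Hardy-Weinberg genotype probabilities

  # Joint probability of every mother x father genotype combination.
  joint <- outer(prob, prob)

  # Expected offspring code for each combination (mean of parental codes).
  offspring <- outer(geno, geno, "+") / 2

  sum(joint * offspring)
}

p <- 0.3
expected_both_missing(p)
all.equal(expected_both_missing(p), 2 * p)
```

The 3 x 3 table built with outer() contains the same combinations as the six terms, with the symmetric crosses (for example 0 x 1 and 1 x 0) counted separately instead of being multiplied by 2.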

Note that the use of na.action = "expected" is recommended when a large number of offspring will make up the hybrid cross (such as with bulked DNA analyses), that is, for family groups with a reasonable number of individuals.

Warning. If "expected" is used for heterozygote.action or na.action, direct transformation of the molecular data to other codings (e.g., dominance matrix coded as c(0,1,0)) is not recommended.
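
A small worked case may help to show why this recoding is discouraged. The sketch below takes the offspring of a 0 x 1 cross and compares the expected dominance code with what a naive c(0,1,0) recoding of the expected additive value would give. All object names are illustrative and nothing here calls ASRgenomics.

```r
# Offspring of a 0 x 1 cross: genotypes 0 and 1, each with probability 1/2.
geno <- c(0, 1)
prob <- c(0.5, 0.5)

# Expected additive code and expected dominance code (1 only for heterozygotes).
additive_expected  <- sum(geno * prob)
dominance_expected <- sum((geno == 1) * prob)

# Naive c(0, 1, 0) recoding applied directly to the expected additive value.
naive_dominance <- as.numeric(additive_expected == 1)

c(additive = additive_expected,
  dominance = dominance_expected,
  naive = naive_dominance)
```

Half of the offspring are heterozygotes, so the expected dominance code is 0.5, whereas recoding the expected additive value 0.5 yields 0 because 0.5 is not one of the original codes.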

Value

A molecular matrix M containing the genotypes generated/imputed for the hypothetical cross.

Examples

# Create dummy pedigree (using first 10 as parents).
ped <- data.frame(
 male = rownames(geno.apple)[1:5],
 female = rownames(geno.apple)[6:10])
ped$offs <- paste(ped$male, ped$female, sep = "_")
ped

# Select portion of M for parents.
Mp <- geno.apple[c(ped$male, ped$female), 1:15]

# Get genotype of crosses removing markers with heterozygotes.
synthetic.cross(
 M = Mp, ped = ped,
 indiv = "offs", mother = "female", father = "male",
 heterozygote.action = "exact",
 na.action = "useNA")

# Request the synthetic cross to be NA in the respective samples.
synthetic.cross(
 M = Mp, ped = ped,
 indiv = "offs", mother = "female", father = "male",
 heterozygote.action = "useNA",
 na.action = "useNA")

# Get genotype of crosses and use expected values.
synthetic.cross(
 M = Mp, ped = ped,
 indiv = "offs", mother = "female", father = "male",
 heterozygote.action = "expected", na.action = "expected")


[Package ASRgenomics version 1.1.3 Index]