catalanAlleles {polysat} | R Documentation |
Sort Alleles into Isoloci
Description
catalanAlleles
uses genotypes present in a "genambig"
object to sort alleles from one locus into two or more isoloci in an
allopolyploid or diploidized autopolyploid species. Alleles are
determined to belong to different
isoloci if they are both present in a fully homozygous genotype. If
necessary, heterozygous genotypes are also examined to resolve remaining
alleles.
Usage
catalanAlleles(object, samples = Samples(object), locus = 1,
n.subgen = 2, SGploidy = 2, verbose = FALSE)
Arguments
object |
A |
samples |
Optional argument indicating samples to be analyzed. Can be integer or character, as with other polysat functions. |
locus |
An integer or character string indicating which locus to analyze. Cannot be a vector greater than length 1. (The function will only analyze one locus at a time.) |
n.subgen |
The number of isoloci (number of subgenomes). For example, |
SGploidy |
The ploidy of each genome. Only one value is allowed; all genomes
must be the same ploidy. |
verbose |
Boolean. Indicates whether results, and if applicable, problematic genotypes, should be printed to the console. |
Details
catalanAlleles
implements and extends an approach used by Catalan
et al. (2006) that sorts alleles from a duplicated microsatellite
locus into two or more isoloci (homeologous loci on different
subgenomes). First, fully homozygous genotypes are identified and used
in analysis. If a genotype has as many alleles as there are subgenomes
(for example, a genotype with two alleles in an allotetraploid species),
it is assumed to be fully homozygous and the alleles are assumed to
belong to different subgenomes. If some alleles remain unassigned after
examination of all fully homozygous genotypes, heterozygous genotypes
are also examined to attempt to assign those remaining alleles.
For example, in an allotetraploid, if a genotype contains one unassigned allele, and all other alleles in the genotype are known to belong to one isolocus, the unassigned allele can be assigned to the other isolocus. Or, if two alleles in a genotype belong to one isolocus, one allele belongs to the other isolocus, and one allele is unassigned, the unassigned allele can be assigned to the latter isolocus. The function follows such logic (which can be extended to higher ploidies) until all alleles can be assigned, or returns a text string saying that the allele assignments were unresolvable.
It is important to note that this method assumes no null alleles and no homoplasy across isoloci. If the function encounters evidence of either it will not return allele assignments. Null alleles and homoplasy are real possibilities in any dataset, which means that this method simply will not work for some microsatellite loci.
(Null alleles are those that do not produce a PCR amplicon, usually because of a mutation in the primer binding site. Alleles that exhibit homoplasy are those that produce amplicons of the same size, despite not being identical by descent. Specifically, homoplasy between alleles from different isoloci will interfere with the Catalan method of allele assignment.)
Value
A list containing the following items:
locus |
A character string giving the name of the locus. |
SGploidy |
A number giving the ploidy of each subgenome.
Identical to the |
assignments |
If assignments cannot be made, a character string
describing the problem. Otherwise, a matrix with |
Note
Aside from homoplasy and null alleles, stochastic effects may prevent the minimum combination of genotypes needed to resolve all alleles from being present in the dataset. For a typical allotetraploid dataset, 50 to 100 samples will be needed, whereas an allohexaploid dataset may require over 100 samples. In simulations, allo-octoploid datasets with two tetraploid genomes were unresolvable even with 10,000 samples due to the low probability of finding full homozygotes. Additionally, loci are less likely to be resolvable if they have many alleles or if one isolocus is monomorphic.
Although determination of allele copy number by is not needed (or
expected) for catalanAlleles
as it was in the originally
published Catalan method, it is still very important that the
genotypes be high quality. Even a single scoring error can cause the
method to fail, including allelic dropout, contamination between
samples, stutter peaks miscalled as alleles, and PCR artifacts
miscalled as alleles. Poor quality loci (those that require some
“artistic” interpretation of gels or electropherograms) are
unlikely to work with this method. Individual genotypes that are of
questionable quality should be discarded before running the function.
Author(s)
Lindsay V. Clark
References
Catalan, P., Segarra-Moragues, J. G., Palop-Esteban, M., Moreno, C. and Gonzalez-Candelas, F. (2006) A Bayesian approach for discriminating among alternative inheritance hypotheses in plant polyploids: the allotetraploid origin of genus Borderea (Dioscoreaceae). Genetics 172, 1939–1953.
See Also
alleleCorrelations
, mergeAlleleAssignments
,
recodeAllopoly
, simAllopoly
Examples
# make the default simulated allotetraploid dataset
mydata <- simAllopoly()
# resolve the alleles
myassign <- catalanAlleles(mydata)