ExportGAPIT {polyRAD} | R Documentation |
Export RADdata Object for Use by Other R Packages
Description
After a "RADdata"
object has been run through a pipeline such as
IteratePopStruct
, these functions can be used to export
the genotypes to R packages and other software that can
perform genome-wide association and genomic prediction. ExportGAPIT
,
Export_rrBLUP_Amat
, Export_rrBLUP_GWAS
, Export_GWASpoly
,
and Export_TASSEL_Numeric
all export continuous numerical genotypes
generated by GetWeightedMeanGenotypes
. Export_polymapR
,
Export_Structure
, and Export_adegenet_genind
use
GetProbableGenotypes
to export discrete
genotypes. Export_MAPpoly
and Export_polymapR_probs
export
genotype posterior probabilities.
Usage
ExportGAPIT(object, onePloidyPerAllele = FALSE)
Export_rrBLUP_Amat(object, naIfZeroReads = FALSE,
onePloidyPerAllele = FALSE)
Export_rrBLUP_GWAS(object, naIfZeroReads = FALSE,
onePloidyPerAllele = FALSE)
Export_TASSEL_Numeric(object, file, naIfZeroReads = FALSE,
onePloidyPerAllele = FALSE)
Export_polymapR(object, naIfZeroReads = TRUE,
progeny = GetTaxa(object)[!GetTaxa(object) %in%
c(GetDonorParent(object), GetRecurrentParent(object),
GetBlankTaxa(object))])
Export_polymapR_probs(object, maxPcutoff = 0.9,
correctParentalGenos = TRUE,
multiallelic = "correct")
Export_MAPpoly(object, file, pheno = NULL, ploidyIndex = 1,
progeny = GetTaxa(object)[!GetTaxa(object) %in%
c(GetDonorParent(object), GetRecurrentParent(object),
GetBlankTaxa(object))],
digits = 3)
Export_GWASpoly(object, file, naIfZeroReads = TRUE, postmean = TRUE, digits = 3,
splitByPloidy = TRUE)
Export_Structure(object, file, includeDistances = FALSE, extraCols = NULL,
missingIfZeroReads = TRUE)
Export_adegenet_genind(object, ploidyIndex = 1)
Arguments
object |
A |
onePloidyPerAllele |
Logical. If |
naIfZeroReads |
A logical indicating whether |
file |
A character string indicating a file path to which to write. |
pheno |
A data frame or matrix of phenotypic values, with progeny in rows and traits in columns. Columns should be named. |
ploidyIndex |
Index, within |
progeny |
A character vector indicating which individuals to export as progeny of the cross. |
maxPcutoff |
A cutoff for posterior probabilities, below which genotypes will be reported as 'NA' in the 'geno' column. |
correctParentalGenos |
Passed to |
multiallelic |
Passed to |
digits |
Number of decimal places to which to round genotype probabilities or posterior mean genotypes in the output file. |
postmean |
Logical. If |
splitByPloidy |
Logical. If |
includeDistances |
Logical. If |
extraCols |
An optional data frame, with one row per taxon, containing columns of data to output to the left of the genotypes in the Structure file. |
missingIfZeroReads |
See |
Details
GAPIT, FarmCPU, rrBLUP, TASSEL, and GWASpoly allow
genotypes to be a continuous numeric variable. MAPpoly and polymapR
allow for import of genotype probabilities.
GAPIT does not allow missing data, hence there is no naIfZeroReads
argument for ExportGAPIT
. Genotypes are exported on a scale of -1
to 1 for rrBLUP, on a scale of 0 to 2 for GAPIT and FarmCPU,
and on a scale of 0 to 1 for TASSEL.
For all functions except Export_Structure
and Export_adegenet_genind
,
one allele per marker is dropped. Export_MAPpoly
also drops alleles where one or both parental genotypes could not be determined,
and where both parents are homozygotes.
For ExportGAPIT
and Export_rrBLUP_GWAS
, chromosome and position are filled with dummy
data if they do not exist in object$locTable
. For Export_TASSEL_Numeric
,
allele names are exported, but no chromosome or position information per se.
If the chromosomes in object$locTable
are in character format,
ExportGAPIT
, Export_MAPpoly
, and Export_GWASpoly
will
attempt to extract chromosome numbers.
For polymapR there must only be one possible inheritance mode across loci
(one value in object$possiblePloidies
) in the RADdata
object, although
triploid F1 populations derived from diploid and tetraploid parents are allowed.
See SubsetByPloidy
for help reducing a RADdata
object to a
single inheritance mode.
MAPpoly only
allows one ploidy, but Export_MAPpoly
allows the user to select which
inheritance mode from the RADdata
object to use. (This is due to how internal
polyRAD functions are coded.)
Value
For ExportGAPIT
, a list:
GD |
A data frame with taxa in the first column and alleles (markers)
in subsequent columns, containing the genotypes. To be passed to the |
GM |
A data frame with the name, chromosome number, and position of
every allele (marker). To be passed to the |
For Export_rrBLUP_Amat
, a matrix with taxa in rows and alleles (markers)
in columns, containing genotype data. This can be passed to A.mat
in
rrBLUP.
For Export_rrBLUP_GWAS
, a data frame with alleles (markers) in rows.
The first three columns contain the marker names, chromosomes, and positions,
and the remaining columns each represent one taxon and contain the genotype
data. This can be passed to the GWAS
function in rrBLUP.
Export_TASSEL_Numeric
and Export_MAPpoly
write a file but does
not return an object.
For Export_polymapR
, a matrix of integers indicating the most probable
allele copy number, with markers in rows and individuals in columns. The
parents are listed first, followed by all progeny.
For Export_polymapR_probs
, a data frame suitable to pass to the
probgeno_df
argument of checkF1. Note that under
default parameters, in some cases the maxP
, maxgeno
, and
geno
columns may not actually reflect the maximum posterior probability
if genotype correction was performed.
For Export_adegenet_genind
, a "genind"
object.
Export_MAPpoly
, Export_GWASpoly
, and Export_Structure
write files but do not return
an object. Files output by Export_GWASpoly
are comma delimited and
in numeric format. Sample and locus names are included in the file output
by Export_Structure
, and the number of rows for each sample is
equal to the highest ploidy as determined by the taxaPloidy
slot and the
output of GetProbableGenotypes
.
Note
rrBLUP and polymapR are available through CRAN, and GAPIT and FarmCPU must be downloaded from the Zhang lab website. MAPpoly is available on GitHub but not yet on CRAN. GWASpoly is available from GitHub.
In my experience with TASSEL 5, numerical genotype files that are too large do
not load/display properly. If you run into this problem I recommend using
SplitByChromosome
to split your RADdata
object into
multiple smaller objects, which can then be exported to separate files using
Export_TASSEL_Numeric
. If performing GWAS, you may also need to compute
a kinship matrix using separate software such as rrBLUP.
Author(s)
Lindsay V. Clark
References
GAPIT and FarmCPU:
Lipka, A. E., Tian, F., Wang, Q., Peiffer, J., Li, M., Bradbury, P. J., Gore, M. A., Buckler, E. S. and Zhang, Z. (2012) GAPIT: genome association and prediction integrated tool. Bioinformatics 28, 2397–2399.
Liu, X., Huang, M., Fan, B., Buckler, E. S., Zhang, Z. (2016) Iterative usage of fixed and random effects models for powerful and efficient genome-wide association studies. PLoS Genetics 12, e1005767.
rrBLUP:
Endelman, J.B. (2011) Ridge Regression and Other Kernels for Genomic Selection with R Package rrBLUP. The Plant Genome 4, 250–255.
TASSEL:
https://www.maizegenetics.net/tassel
Bradbury, P. J., Zhang, Z., Kroon, D. E., Casstevens, T. M., Ramdoss, Y. and Buckler, E. S. (2007) TASSEL: Software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633–2635.
polymapR:
Bourke, P., van Geest, G., Voorrips, R. E., Jansen, J., Kranenberg, T., Shahin, A., Visser, R. G. F., Arens, P., Smulders, M. J. M. and Maliepaard, C. (2018) polymapR: linkage analysis and genetic map construction from F1 populations of outcrossing polyploids. Bioinformatics 34, 3496–3502.
MAPpoly:
https://github.com/mmollina/MAPpoly
Mollinari, M. and Garcia, A. A. F. (2018) Linkage analysis and haplotype phasing in experimental autopolyploid populations with high ploidy level using hidden Markov models. bioRxiv doi: https://doi.org/10.1101/415232.
GWASpoly:
https://github.com/jendelman/GWASpoly
Rosyara, U. R., De Jong, W. S., Douches, D. S., and Endelman, J. B. (2016) Software for Genome-Wide Association Studies in Autopolyploids and Its Application to Potato. Plant Genome 9.
Structure:
https://web.stanford.edu/group/pritchardlab/structure.html
Hubisz, M. J., Falush, D., Stephens, M. and Pritchard, J. K. (2009) Inferring weak population structure with the assistance of sample group information. Molecular Ecology Resources 9, 1322–1332.
Falush, D., Stephens, M. and Pritchard, J. K. (2007) Inferences of population structure using multilocus genotype data: dominant markers and null alleles. Molecular Ecology Notes 7, 574–578
Falush, D., Stephens, M. and Pritchard, J. K. (2003) Inferences of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164, 1567–1587.
Pritchard, J. K., Stephens, M. and Donnelly, P. (2000) Inference of population structure using multilocus genotype data. Genetics 155, 945–959.
See Also
GetWeightedMeanGenotypes
, RADdata2VCF
Examples
# load example dataset
data(exampleRAD)
# get genotype posterior probabilities
exampleRAD <- IterateHWE(exampleRAD)
# export to GAPIT
exampleGAPIT <- ExportGAPIT(exampleRAD)
# export to rrBLUP
example_rrBLUP_A <- Export_rrBLUP_Amat(exampleRAD)
example_rrBLUP_GWAS <- Export_rrBLUP_GWAS(exampleRAD)
# export to TASSEL
outfile <- tempfile() # temporary file for example
Export_TASSEL_Numeric(exampleRAD, outfile)
# for mapping populations
data(exampleRAD_mapping)
# specify donor and recurrent parents
exampleRAD_mapping <- SetDonorParent(exampleRAD_mapping, "parent1")
exampleRAD_mapping <- SetRecurrentParent(exampleRAD_mapping, "parent2")
# run the pipeline
exampleRAD_mapping <- PipelineMapping2Parents(exampleRAD_mapping)
# convert to polymapR format
examplePMR <- Export_polymapR(exampleRAD_mapping)
examplePMR2 <- Export_polymapR_probs(exampleRAD_mapping)
# export to MAPpoly
outfile2 <- tempfile() # temporary file for example
# generate a dummy phenotype matrix containing random numbers
mypheno <- matrix(rnorm(200), nrow = 100, ncol = 2,
dimnames = list(GetTaxa(exampleRAD_mapping)[-(1:2)],
c("Height", "Yield")))
Export_MAPpoly(exampleRAD_mapping, file = outfile2, pheno = mypheno)
# load data into MAPpoly
# require(mappoly)
# mydata <- read_geno_prob(outfile2)
# export to GWASpoly
outfile3 <- tempfile() # temporary file for example
Export_GWASpoly(SubsetByPloidy(exampleRAD, list(2)), outfile3)
# export to Structure
outfile4 <- tempfile() # temporary file for example
Export_Structure(exampleRAD, outfile4)
# export to adegenet
if(requireNamespace("adegenet", quietly = TRUE)){
mygenind <- Export_adegenet_genind(exampleRAD)
}