RADdata2VCF {polyRAD} | R Documentation
Export RADdata Genotypes to VCF
Description
Converts genotype calls from polyRAD into VCF format. The user may send the results directly to a file, or to a CollapsedVCF object for further manipulation.
Usage
RADdata2VCF(object, file = NULL, asSNPs = TRUE, hindhe = TRUE,
            sampleinfo = data.frame(row.names = GetTaxa(object)),
            contigs = data.frame(row.names = unique(object$locTable$Chr)))
Arguments
object |
A RADdata object. |
file |
An optional character string or connection indicating where to write the file. Append mode may be used with connections if multiple RADdata objects need to be written to one VCF. |
asSNPs |
Boolean indicating whether to convert haplotypes to individual SNPs and indels. |
hindhe |
Boolean indicating whether to export a mean value of Hind/He (see HindHe). |
sampleinfo |
A data frame with optional columns indicating any sample metadata to export to "SAMPLE" header lines. |
contigs |
A data frame with optional columns providing information about contigs to export to "contig" header lines. |
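To make the sampleinfo and contigs arguments more concrete, the following base R sketch shows how the rows and columns of such data frames could map onto structured VCF header lines, with the row name used as the ID. The helper make_header_lines is hypothetical and is not part of polyRAD; it only follows the general "##key=<ID=...,field=value>" syntax of the VCF specification, and the exact text that RADdata2VCF writes may differ.

```r
# Hypothetical helper (not part of polyRAD): build one structured VCF
# header line per row of a data frame, using the row name as the ID.
make_header_lines <- function(df, key) {
  vapply(seq_len(nrow(df)), function(i) {
    fields <- c(ID = rownames(df)[i],
                vapply(df[i, , drop = FALSE], as.character, character(1)))
    paste0("##", key, "=<",
           paste0(names(fields), "=", fields, collapse = ","), ">")
  }, character(1))
}

# Made-up sample metadata and contig information
sampleinfo <- data.frame(row.names = c("Sam1", "Sam2"),
                         Population = c("North", "South"))
mycontigs <- data.frame(row.names = c("1", "4"),
                        length = c(5123456L, 3234567L),
                        URL = rep("ftp://mygenome.com/mygenome.fa", 2))

sample_lines <- make_header_lines(sampleinfo, "SAMPLE")
contig_lines <- make_header_lines(mycontigs, "contig")
print(sample_lines)
print(contig_lines)
```

Each column of the data frame becomes one key-value pair in the header line, so only columns that are meaningful as metadata should be included.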
Details
Currently, the FORMAT fields exported are GT (genotype), AD (allelic read depth), and DP (read depth). Genotype posterior probabilities are not exported due to the mathematical intractability of converting pseudo-biallelic probabilities to multiallelic probabilities.
Genotypes exported to the GT field are obtained internally using GetProbableGenotypes.
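As a rough illustration of what a GT entry for a polyploid looks like, the base R sketch below picks the most probable allele copy number from a set of made-up posterior probabilities and writes it as an unphased VCF genotype string. It assumes a single biallelic marker in a tetraploid, and the helper copies_to_gt is hypothetical; it is not the code that polyRAD or GetProbableGenotypes uses internally.

```r
# Hypothetical helper (not part of polyRAD): convert the number of copies of
# the alternative allele into an unphased VCF GT string.
copies_to_gt <- function(alt_copies, ploidy) {
  if (is.na(alt_copies)) {
    # Missing genotype: one "." per chromosome copy
    return(paste(rep(".", ploidy), collapse = "/"))
  }
  stopifnot(alt_copies >= 0, alt_copies <= ploidy)
  alleles <- c(rep(0L, ploidy - alt_copies), rep(1L, alt_copies))
  paste(alleles, collapse = "/")
}

# Made-up posterior probabilities for 0 to 4 copies of the alternative allele
post <- c(`0` = 0.01, `1` = 0.09, `2` = 0.80, `3` = 0.09, `4` = 0.01)

# Most probable copy number, then its GT representation
best <- as.integer(names(post)[which.max(post)])
gt <- copies_to_gt(best, ploidy = 4)
gt_missing <- copies_to_gt(NA, ploidy = 4)
print(gt)
print(gt_missing)
```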
INFO fields exported include the standard fields NS (number of samples with more than zero reads) and DP (total depth across samples) as well as the custom fields LU (index of the marker in the original RADdata object) and HH (Hind/He statistic for the marker).
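The definitions of NS and DP given above can be sketched in base R as follows. The depth matrix is made up for illustration (samples in rows, markers in columns) and is not taken from a real RADdata object; the sketch only demonstrates the arithmetic behind the two fields, not how polyRAD computes them.

```r
# Made-up total read depths: rows are samples, columns are markers
depth <- matrix(c(12, 0, 7,
                   0, 0, 3,
                   5, 9, 0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("Sam1", "Sam2", "Sam3"),
                                c("loc1", "loc2", "loc3")))

# NS: number of samples with more than zero reads at each marker
NS <- colSums(depth > 0)

# DP: total read depth across samples at each marker
DP <- colSums(depth)

print(rbind(NS = NS, DP = DP))
```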
This function requires the Bioconductor package VariantAnnotation. See https://bioconductor.org/packages/release/bioc/html/VariantAnnotation.html for installation instructions.
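Before calling RADdata2VCF, it may be convenient to check whether VariantAnnotation is available. The sketch below uses a hypothetical helper, check_variantannotation, which is not part of polyRAD. It only reports whether the package can be loaded and prints one common installation route through BiocManager; the linked Bioconductor page remains the authoritative source for installation instructions.

```r
# Hypothetical helper (not part of polyRAD): report whether VariantAnnotation
# is installed, and print an installation hint if it is not.
check_variantannotation <- function() {
  ok <- requireNamespace("VariantAnnotation", quietly = TRUE)
  if (!ok) {
    message("VariantAnnotation is not installed. One common way to install it:\n",
            "  install.packages(\"BiocManager\")\n",
            "  BiocManager::install(\"VariantAnnotation\")")
  }
  invisible(ok)
}

# TRUE if the package is available, FALSE otherwise
have_va <- check_variantannotation()
```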
Value
A CollapsedVCF object.
Author(s)
Lindsay V. Clark
References
https://samtools.github.io/hts-specs/VCFv4.3.pdf
See Also
Examples
# Set up example dataset for export.
# You DO NOT need to adjust attr or locTable in your own dataset.
data(exampleRAD)
attr(exampleRAD$alleleNucleotides, "Variable_sites_only") <- FALSE
exampleRAD$locTable$Ref <-
  exampleRAD$alleleNucleotides[match(seq_len(nLoci(exampleRAD)), exampleRAD$alleles2loc)]
exampleRAD <- IterateHWE(exampleRAD)
# An optional table of sample data
sampleinfo <- data.frame(row.names = GetTaxa(exampleRAD),
                         Population = rep(c("North", "South"), each = 50))
# Add contig information (fill in with actual data rather than random)
mycontigs <- data.frame(row.names = c("1", "4", "6", "9"), length = sample(1e8, 4),
                        URL = rep("ftp://mygenome.com/mygenome.fa", 4))
# Set up a file destination for this example
# (It is not necessary to use tempfile with your own data)
outfile <- tempfile(fileext = ".vcf")
# Export VCF
testvcf <- RADdata2VCF(exampleRAD, file = outfile, sampleinfo = sampleinfo,
                       contigs = mycontigs)