prepHaplotFiles {microhaplot} | R Documentation |
Extracts haplotypes from alignment reads.
Description
The function prepHaplotFiles extracts haplotypes from sequence alignment files through the Perl script hapture and returns a summary table of the read depth and read quality associated with each haplotype.
Usage
prepHaplotFiles(run.label, sam.path, label.path, vcf.path,
out.path = tempdir(), add.filter = FALSE, app.path = tempdir(),
n.jobs = 1)
Arguments
run.label |
character vector. Run label to display in haPLOType. Required |
sam.path |
string. Path to the directory containing all sequence alignment (SAM) files. Required |
label.path |
string. Label file path. This customized label file is a tab-separated file in which each entry contains the SAM file name, individual ID, and group label. Required |
vcf.path |
string. VCF file path. Required |
out.path |
string. Optional. If not specified, the intermediate files are created under |
add.filter |
boolean. Optional. If true, removes any haplotype containing unknown or deletion alignment characters (i.e. "*" and "_"), removes any locus with a large number of haplotypes (more than 40), and removes any locus with fewer than half of the total individuals. |
app.path |
string. Path to shiny haPLOType app. Optional. If not specified, the path is default to |
n.jobs |
positive integer. Optional. Number of SAM files to process in parallel. Multithreading is only available on non-Windows operating systems. The recommended value is two times the number of processors/cores. |
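The label file described under label.path can be sketched as follows. This is a minimal illustration, not the package's own tooling: the SAM file names, IDs, and group labels are hypothetical, and the absence of a header row and of quoting is an assumption about the expected layout.

```r
# Hypothetical SAM file names, individual IDs, and group labels.
label.tbl <- data.frame(
  sam.file = c("ind_01.sam", "ind_02.sam", "ind_03.sam"),
  id       = c("ind_01", "ind_02", "ind_03"),
  group    = c("north", "north", "south"),
  stringsAsFactors = FALSE
)

label.path <- file.path(tempdir(), "label.txt")

# Write one tab-separated entry per SAM file.
# Assumption: no header row and no quoting are expected.
write.table(label.tbl, file = label.path, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Inspect the resulting file.
readLines(label.path)
```

The resulting path can then be passed as the label.path argument.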
Value
This function returns a data frame of 9 columns: group, id, locus, haplotype, depth, sum of Phred scores, max Phred score, allele balance, and haplotype rank (ordered from highest to lowest read depth). This data frame is also saved in out.path.
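To make the three add.filter rules concrete, the sketch below applies them to a small made-up table. It is an illustration only, not the package's internal code: the column names (id, locus, haplo), the toy data, and the order in which the rules are applied are all assumptions.

```r
# Toy haplotype table; column names and values are hypothetical.
tbl <- data.frame(
  id    = c("a", "a", "b", "b", "c", "a"),
  locus = c("L1", "L1", "L1", "L1", "L1", "L2"),
  haplo = c("AC", "A*", "AC", "GC", "G_", "TT"),
  stringsAsFactors = FALSE
)

# Total number of individuals, counted before any filtering.
n.indiv <- length(unique(tbl$id))

# Rule 1: drop haplotypes containing "*" or "_".
tbl <- tbl[!grepl("[*_]", tbl$haplo), ]

# Per-locus counts of distinct haplotypes and distinct individuals.
n.haplo <- tapply(tbl$haplo, tbl$locus, function(h) length(unique(h)))
n.id    <- tapply(tbl$id, tbl$locus, function(i) length(unique(i)))

# Rule 2: keep loci with at most 40 haplotypes.
# Rule 3: keep loci seen in at least half of the individuals.
keep.locus <- as.vector(n.haplo <= 40 & n.id >= n.indiv / 2)
ok.loci <- names(n.haplo)[keep.locus]

filtered <- tbl[tbl$locus %in% ok.loci, ]
filtered
```

In this toy case locus L2 is dropped because only one of three individuals carries it, while L1 is kept with its two clean haplotypes.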
Examples
run.label <- "sebastes"

# unpack the example SAM files shipped with the package
sam.path <- tempdir()
untar(system.file("extdata",
                  "sebastes_sam.tar.gz",
                  package = "microhaplot"),
      exdir = sam.path)

label.path <- file.path(sam.path, "label.txt")
vcf.path <- file.path(sam.path, "sebastes.vcf")

# set up the shiny haPLOType app under tempdir()
mvShinyHaplot(tempdir())
app.path <- file.path(tempdir(), "microhaplot")

# retrieve system Perl version number
perl.version <- as.numeric(system('perl -e "print $];"', intern = TRUE))

if (perl.version >= 5.014) {
  haplo.read.tbl <- prepHaplotFiles(run.label = run.label,
                                    sam.path = sam.path,
                                    out.path = tempdir(),
                                    label.path = label.path,
                                    vcf.path = vcf.path,
                                    app.path = app.path)
} else {
  message("Perl version is outdated. Must be >= 5.014.")
}
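One possible way to choose n.jobs in line with the recommendation in the Arguments section is sketched below. It relies only on base R and the bundled parallel package. The fallback to a single job on Windows, or when the core count cannot be detected, is an assumption made for this sketch.

```r
# Number of cores reported by the system; detectCores() may return NA.
n.cores <- parallel::detectCores()
if (is.na(n.cores)) n.cores <- 1L

# Parallel processing is described as unavailable on Windows,
# so fall back to a single job there (assumption for this sketch).
n.jobs <- if (.Platform$OS.type == "windows") 1L else 2L * n.cores

n.jobs
```

The value can then be supplied as the n.jobs argument of prepHaplotFiles.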