gff2fasta {microseq} | R Documentation |
Retrieving annotated sequences
Description
Retrieves from a genome the sequences specified in a gff.table.
Usage
gff2fasta(gff.table, genome)
Arguments
gff.table |
A |
genome |
A fasta object ( |
Details
Each row in gff.table (see readGFF) describes a genomic feature in the genome, which is a tibble with columns ‘Header’ and ‘Sequence’. The information in the columns Seqid, Start, End and Strand is used to retrieve the sequences from the ‘Sequence’ column of genome. Every Seqid in the gff.table must match the first token in one of the ‘Header’ texts, in order to retrieve from the correct ‘Sequence’.
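The retrieval described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the helper rev_comp, and the variable names are hypothetical. It also assumes that Start and End are 1-based inclusive coordinates and that features on the "-" strand are returned reverse-complemented, neither of which is stated on this page.

```r
# Toy genome in the layout described above: a table with 'Header' and
# 'Sequence' columns (a plain data.frame stands in for the tibble).
genome <- data.frame(
  Header   = c("contig1 toy sequence", "contig2 another toy sequence"),
  Sequence = c("ATGAAACCCGGGTTTTAA", "TTAGGGCCCAAACAT"),
  stringsAsFactors = FALSE
)

# Toy feature table holding only the four columns used for retrieval.
gff.table <- data.frame(
  Seqid  = c("contig1", "contig2"),
  Start  = c(1, 1),
  End    = c(9, 6),
  Strand = c("+", "-"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reverse-complement each DNA string in a vector.
rev_comp <- function(dna) {
  comp <- chartr("ACGTacgt", "TGCAtgca", dna)
  vapply(strsplit(comp, ""),
         function(x) paste(rev(x), collapse = ""),
         character(1))
}

# The first token of each Header is what a Seqid must match.
first.token <- sub("\\s.*$", "", genome$Header)
idx <- match(gff.table$Seqid, first.token)
stopifnot(!anyNA(idx))  # every Seqid must be found among the Headers

# Cut out Start..End from the matching Sequence (assumed 1-based, inclusive).
seqs <- substr(genome$Sequence[idx], gff.table$Start, gff.table$End)

# Assumption: negative-strand features are reverse-complemented.
neg <- gff.table$Strand == "-"
seqs[neg] <- rev_comp(seqs[neg])

print(seqs)
```

The match() step is where a Seqid that is absent from the first tokens of the ‘Header’ texts would show up as NA, which is why the sketch stops early in that case.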
Value
A fasta object with one row for each row in gff.table. The Header for each sequence is a summary of the information in the corresponding row of gff.table.
Author(s)
Lars Snipen and Kristian Hovde Liland.
See Also
Examples
# Using two files in this package
gff.file <- file.path(path.package("microseq"),"extdata","small.gff")
genome.file <- file.path(path.package("microseq"),"extdata","small.fna")
# Reading the genome first
genome <- readFasta(genome.file)
# Retrieving sequences
gff.table <- readGFF(gff.file)
fa.tbl <- gff2fasta(gff.table, genome)
# Alternative, using piping
readGFF(gff.file) %>% gff2fasta(genome) -> fa.tbl