str {gquad} | R Documentation |
Predicting short tandem repeats
Description
This function predicts short tandem repeats (STRs) in the nucleotide sequence(s) 'x'. Sequences can be provided in raw format, as a fasta file, or as GenBank accession number(s). An internet connection is required to query the GenBank database when accession number(s) are given as the argument.
Usage
str(x, xformat = "default")
Arguments
x |
Nucleotide sequence(s) in raw format, a fasta file, or GenBank accession number(s) from which short tandem repeats will be predicted. If the fasta file name does not contain an absolute path, it is interpreted relative to the current working directory. |
xformat |
A character string specifying the format of x: "default" (raw sequence), "fasta" (fasta file), or "GenBank" (GenBank accession number(s)). |
Details
This function predicts short tandem repeats in nucleotide sequences and reports the position, sequence and length of each predicted repeat, if any.
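To make the idea concrete, the sketch below finds tandem repeats with a base R regular expression. It is only an illustration and is not gquad's algorithm: the helper name find_str_sketch, the column names, and the thresholds (repeat unit of 2 to 6 nt, at least 3 consecutive copies) are all assumptions chosen for the example.

```r
# Hypothetical helper, NOT part of gquad: a minimal regex-based sketch.
# Assumed thresholds: unit of 2-6 nt repeated at least 3 times in a row.
find_str_sketch <- function(seq, min_unit = 2, max_unit = 6, min_repeats = 3) {
  seq <- toupper(seq)  # accept lower-case input such as "ugugug"
  # Lazy unit match (shortest unit first), then a backreference that
  # requires at least (min_repeats - 1) further copies of the same unit.
  pattern <- sprintf("([ACGTU]{%d,%d}?)\\1{%d,}",
                     min_unit, max_unit, min_repeats - 1)
  hits <- gregexpr(pattern, seq, perl = TRUE)
  starts <- hits[[1]]
  if (starts[1] == -1) {
    # No repeat found: return an empty data frame with the same columns.
    return(data.frame(position = integer(0), sequence = character(0),
                      length = integer(0), stringsAsFactors = FALSE))
  }
  data.frame(position = as.integer(starts),
             sequence = regmatches(seq, hits)[[1]],
             length = attr(starts, "match.length"),
             stringsAsFactors = FALSE)
}

res <- find_str_sketch("TCTACACACACACACACACACGAAT")
print(res)
```

For this input the sketch reports a single repeat starting at position 4 with length 18 (nine copies of AC). With a minimum unit of 2 nt, homopolymer runs are also reported as repeats of a doubled base, which may or may not match what gquad does.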
Value
A data frame giving the position, sequence and length of each predicted short tandem repeat. If more than one sequence is provided as argument, an input ID is returned to identify the input sequence from which each repeat was predicted.
Author(s)
Hannah O. Ajoge
References
A paper on gquad and its web application (Non-B DNA Predictor) is under review; see the draft in the package vignettes.
Examples
## Predicting short tandem repeats from raw nucleotide sequences
E1 <- c("TCTACACACACACACACACACGAAT", "tagggugugugugugugugugugutcct")
str(E1)
## Predicting short tandem repeats from nucleotide sequences in a fasta file
## Not run: str(x="Example.fasta", xformat = "fasta")
## Predicting short tandem repeats from nucleotide sequences,
## using GenBank accession numbers.
## Internet connectivity is needed for this to work.
## Not run: str(c("BH114913", "AY611035"), xformat = "GenBank")
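Because this function shares its name with str() from the utils package, attaching gquad masks the base structure-display function. The sketch below calls each one through its namespace to avoid ambiguity. It assumes gquad is installed and is skipped otherwise; the variable name repeats is only illustrative.

```r
# Same raw sequences as in the first example above.
E1 <- c("TCTACACACACACACACACACGAAT", "tagggugugugugugugugugugutcct")

# Assumption: gquad is installed; the block does nothing if it is not.
if (requireNamespace("gquad", quietly = TRUE)) {
  repeats <- gquad::str(E1)  # gquad's repeat predictor, called explicitly
  utils::str(repeats)        # base R structure summary of the result
}
```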