import.msf {bios2mds} | R Documentation |
Reads a multiple sequence alignment file in MSF format
Description
Reads a Multiple Sequence Alignment (MSA) file in MSF format (.msf extension).
Usage
import.msf(file, aa.to.upper = TRUE, gap.to.dash = TRUE)
Arguments
file |
a string of characters to indicate the name of the MSA file to be read. |
aa.to.upper |
a logical value indicating whether amino acids should be converted to upper case (TRUE) or not (FALSE). Default is TRUE. |
gap.to.dash |
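a logical value indicating whether the dot (.) and tilde (~) gap symbols should be converted to the dash (-) character (TRUE) or not (FALSE). Default is TRUE. |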
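The effect of the two logical arguments can be pictured with base R alone. The snippet below is only an illustration of the conversions described above, not the package's source code; the vector raw_seq and its one-residue-per-element layout are assumptions made for the example.

```r
# Hypothetical sequence as it might appear in an MSF file:
# lower-case residues, with "." and "~" used as gap symbols.
raw_seq <- c("m", "k", "t", ".", "~", "l")

# What aa.to.upper = TRUE is described as doing: upper-case the residues.
upper_seq <- toupper(raw_seq)

# What gap.to.dash = TRUE is described as doing: map "." and "~" to "-".
dashed_seq <- chartr(".~", "--", upper_seq)

print(dashed_seq)
```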
Details
Multiple Sequence Format (MSF) originated as the multiple sequence alignment format of the Wisconsin Package (WP), also known as GCG (Genetics Computer Group). This package is a suite of over 130 sequence analysis programs for database searching, secondary structure prediction, and sequence alignment. Many multiple sequence alignment editors (Jalview and GeneDoc, for example) can now read and write MSF files.
An MSF file has three parts:
a header containing the sequence identifiers and their characteristics (length, check and weight).
a separator consisting of two slashes (//).
the sequences, each labeled by its identifier, displayed in consecutive blocks.
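The three parts can be seen in a tiny hand-written file. The sketch below uses only base R to split such a file at the separator and join the blocks per identifier. The file content, the identifiers seqA and seqB, and the parsing logic are all assumptions made for illustration; this is not how import.msf is implemented.

```r
# A minimal, made-up MSF file held as a character vector of lines.
msf_lines <- c(
  "PileUp",
  "",
  " MSF: 12  Type: P  Check: 0  ..",
  "",
  " Name: seqA  Len: 12  Check: 0  Weight: 1.00",
  " Name: seqB  Len: 12  Check: 0  Weight: 1.00",
  "",
  "//",
  "",
  "seqA  MKT..LLV",
  "seqB  MKTAYLLV",
  "",
  "seqA  ~~GH",
  "seqB  WWGH"
)

# Split at the "//" separator: header before it, sequence blocks after it.
sep_idx <- which(trimws(msf_lines) == "//")[1]
header  <- msf_lines[seq_len(sep_idx - 1)]
body    <- msf_lines[-seq_len(sep_idx)]

# Identifiers are taken from the "Name:" lines of the header.
name_lines <- grep("^[[:space:]]*Name:", header, value = TRUE)
ids <- sub("^[[:space:]]*Name:[[:space:]]*([^[:space:]]+).*$", "\\1", name_lines)

# Concatenate the pieces of each sequence across consecutive blocks.
body   <- body[nzchar(trimws(body))]
fields <- strsplit(trimws(body), "[[:space:]]+")
seqs   <- setNames(character(length(ids)), ids)
for (f in fields) {
  id <- f[1]
  if (id %in% ids) {
    seqs[id] <- paste0(seqs[id], paste(f[-1], collapse = ""))
  }
}

print(seqs)
```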
Value
An object of class 'align', which is a named list whose elements correspond to sequences, in the form of character vectors.
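To show how such a value can be handled, the snippet below builds a stand-in object by hand and inspects it with base R. The object aln_like, its identifiers, and the one-residue-per-element layout are assumptions for illustration; an object returned by import.msf may differ in detail.

```r
# Hand-built stand-in for an 'align' object (hypothetical content).
aln_like <- list(
  seqA = c("M", "K", "T", "-", "L"),
  seqB = c("M", "R", "T", "A", "L")
)
class(aln_like) <- "align"

# Sequence identifiers are the names of the list.
seq_ids <- names(aln_like)

# Number of alignment positions stored for each sequence.
seq_len_by_id <- vapply(aln_like, length, integer(1))

# Collapse one sequence into a single string for display.
seqA_string <- paste(aln_like[["seqA"]], collapse = "")

print(seq_ids)
print(seq_len_by_id)
print(seqA_string)
```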
Note
import.msf checks for duplicated identifiers in the header. Sequences whose identifiers are missing from the header are ignored.
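The two behaviours in this note can be mimicked with base R. The snippet below is a rough sketch under assumed inputs (the vectors header_ids and body_ids are hypothetical); it does not show how import.msf reports duplicates, which this page does not specify.

```r
# Hypothetical identifiers found in the header and in the sequence blocks.
header_ids <- c("seqA", "seqB", "seqA")
body_ids   <- c("seqA", "seqB", "seqC")

# Identifiers that occur more than once in the header.
dup_ids <- unique(header_ids[duplicated(header_ids)])

# Sequences whose identifier is absent from the header would be dropped.
kept_ids    <- body_ids[body_ids %in% header_ids]
ignored_ids <- setdiff(body_ids, header_ids)

print(dup_ids)
print(kept_ids)
print(ignored_ids)
```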
Author(s)
Julien Pele
See Also
read.alignment function from the seqinr package.
read.GDoc function from the aaMI package (archived).
Examples
# reading of the multiple sequence alignment of human GPCRs in MSF format:
aln <- import.msf(system.file("msa/human_gpcr.msf", package = "bios2mds"))