import.msf {bios2mds}    R Documentation

Reads a multiple sequence alignment file in MSF format

Description

Reads a Multiple Sequence Alignment (MSA) file in MSF format (.msf extension).

Usage

import.msf(file, aa.to.upper = TRUE, gap.to.dash = TRUE)


Arguments

file
a character string giving the name of the MSA file to be read.

aa.to.upper
a logical value indicating whether amino acids should be converted to upper case (TRUE) or not (FALSE). Default is TRUE.

gap.to.dash
a logical value indicating whether the dot (.) and tilde (~) gap symbols should be converted to the dash (-) character (TRUE) or not (FALSE). Default is TRUE.
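
The effect of the two logical arguments can be pictured with a small base R sketch. The helper normalize_residues is hypothetical and is not part of bios2mds; its argument names are chosen for illustration only, and the package's actual implementation may differ.

```r
# Hypothetical helper (not the bios2mds code): applies the two documented
# conversions to a character vector holding one residue per element.
normalize_residues <- function(residues, aa.to.upper = TRUE, gap.to.dash = TRUE) {
  # Convert amino acid letters to upper case; gap symbols are left unchanged.
  if (aa.to.upper) residues <- toupper(residues)
  # Map the "." and "~" gap symbols to "-", character by character.
  if (gap.to.dash) residues <- chartr(".~", "--", residues)
  residues
}

raw <- c("m", "k", ".", "~", "L", "-")
normalize_residues(raw)                       # "M" "K" "-" "-" "L" "-"
normalize_residues(raw, aa.to.upper = FALSE)  # "m" "k" "-" "-" "L" "-"
normalize_residues(raw, gap.to.dash = FALSE)  # "M" "K" "." "~" "L" "-"
```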

Details

Multiple Sequence Format (MSF) was originally the multiple sequence alignment format of the Wisconsin Package (WP), also known as GCG (Genetics Computer Group). This package is a suite of over 130 sequence analysis programs for database searching, secondary structure prediction and sequence alignment. Today, many multiple sequence alignment editors (Jalview and GeneDoc, for example) can read and write MSF files.

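An MSF file has a distinctive layout: a header that lists each sequence identifier with its length, checksum and weight; a separator line made of two slashes (//); and the sequences themselves, written in consecutive blocks with each line prefixed by its identifier.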
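
The base R sketch below walks through a small, invented MSF-like text: identifiers are taken from the Name: lines of the header, the // line marks the end of the header, and residues are concatenated across blocks. It is a simplified illustration, not the parser used by import.msf; files written by real programs may carry extra material (position rulers, for instance) that this sketch does not handle.

```r
# A tiny MSF-like text, made up for illustration.
msf_lines <- c(
  "PileUp",
  "",
  " MSF: 12  Type: P  Check: 0  ..",
  "",
  " Name: seqA  Len: 12  Check: 0  Weight: 1.00",
  " Name: seqB  Len: 12  Check: 0  Weight: 1.00",
  "",
  "//",
  "",
  "seqA  MKT.LV",
  "seqB  MKTALV",
  "",
  "seqA  ~~GHWK",
  "seqB  PPGHWK"
)

# The first "//" line separates the header from the sequence blocks.
sep <- which(trimws(msf_lines) == "//")[1]
header <- msf_lines[seq_len(sep - 1)]
body <- msf_lines[-seq_len(sep)]

# Sequence identifiers are declared on the "Name:" lines of the header.
name_lines <- grep("^\\s*Name:", header, value = TRUE)
ids <- sub("^\\s*Name:\\s*(\\S+).*$", "\\1", name_lines)

# Drop blank lines, then split each block line into identifier + residue chunks.
body <- body[nzchar(trimws(body))]
fields <- strsplit(trimws(body), "\\s+")

# For each identifier, join its chunks from every block into one residue vector.
seqs <- lapply(ids, function(id) {
  own <- Filter(function(f) f[1] == id, fields)
  chunks <- vapply(own, function(f) paste(f[-1], collapse = ""), character(1))
  strsplit(paste(chunks, collapse = ""), "")[[1]]
})
names(seqs) <- ids

paste(seqs$seqA, collapse = "")  # "MKT.LV~~GHWK"
```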

Value

An object of class 'align', which is a named list whose elements correspond to sequences, in the form of character vectors.
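
To make the shape of such an object concrete, the sketch below builds a toy list by hand with the same structure (one character vector per sequence, named by identifier) and shows a few base R operations on it. The data are invented; a real object would come from import.msf.

```r
# Hand-built stand-in for an 'align' object: a named list of character vectors.
aln <- structure(
  list(
    seqA = c("M", "K", "T", "-", "L", "V"),
    seqB = c("M", "K", "T", "A", "L", "V")
  ),
  class = "align"
)

# Work on the plain list so that only base R list semantics are involved.
seqs <- unclass(aln)

names(seqs)    # sequence identifiers
lengths(seqs)  # alignment length of each sequence

# One row per sequence, one column per alignment position.
aln_matrix <- do.call(rbind, seqs)

# Collapse each sequence back into a single string.
aln_strings <- vapply(seqs, paste, character(1), collapse = "")
```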

Note

import.msf checks for duplicated identifiers in the header. Sequences whose identifiers are missing from the header are ignored.
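
The two checks can be illustrated with plain vectors of identifiers. This is only a sketch with invented identifiers; how import.msf itself reports duplicates (error, warning or otherwise) is not stated here and is not assumed.

```r
# Identifiers declared in the header and identifiers found in the blocks
# (both invented for illustration).
header_ids <- c("seqA", "seqB", "seqB")
body_ids <- c("seqA", "seqB", "seqC")

# Identifiers that occur more than once in the header.
dup_ids <- unique(header_ids[duplicated(header_ids)])

# Sequences declared in the header are kept; the others would be ignored.
kept <- body_ids[body_ids %in% header_ids]
ignored <- setdiff(body_ids, header_ids)

dup_ids  # "seqB"
kept     # "seqA" "seqB"
ignored  # "seqC"
```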

Author(s)

Julien Pele

See Also

read.alignment function from the seqinr package.
read.GDoc function from the aaMI package (archived).

Examples

# reading of the multiple sequence alignment of human GPCRs in MSF format:
aln <- import.msf(system.file("msa/human_gpcr.msf", package = "bios2mds"))

[Package bios2mds version 1.2.3 Index]