read.vhica {vhica}    R Documentation

Reads divergence and codon usage data files for the VHICA method.

Description

The VHICA method relies on two sources of information: (i) the divergence between sequences, and (ii) the codon usage bias. This function reads two data files and creates an object of class vhica that can be further explored by plot.vhica and image.vhica. Input can be either (1) two vectors of FASTA file names (one for the reference genes, one for the putatively transferred genes), or (2) already processed files containing codon usage bias and divergence data (see Details).

Usage

read.vhica(gene.fasta=NULL, target.fasta=NULL, 
	cb.filename=NULL, div.filename=NULL, 
	reference = "Gene", divergence = "dS", 
	CUB.method="ENC", div.method="LWL85", div.pairwise=TRUE, 
	div.max.lim=3, species.sep="_", gene.sep=".", family.sep=".", ...)

Arguments

gene.fasta

Sequence files (FASTA format) containing the aligned sequences (respecting the translation phase) for all species of the reference genes.

target.fasta

Sequence files (FASTA format) containing the aligned sequences of the putatively transferred genes.

cb.filename

File name for the codon usage bias data. If FASTA files are provided, this file will be created.

div.filename

File name for the divergence data. If FASTA files are provided, this file will be created.

reference

Name of the reference type in the codon usage file. Default is "Gene".

divergence

Name of the divergence column in the divergence file. Default is "dS".

CUB.method

Method to be used for Codon Usage Bias calculation (see CUB).

div.method

Method to be used for divergence calculation (see div).

div.pairwise

Whether divergence should be calculated from the whole alignment or between pairs of sequences (see div).

div.max.lim

Maximum divergence score. Estimated divergences much larger than 100% are likely to be problematic and should not be considered.

species.sep

Separator for species (or equivalent) labels in sequence names. Any character string following this separator will be disregarded – be careful about potential duplicates.

gene.sep

Separator for gene names from gene sequence files.

family.sep

Separator for target sequence sub-families.

...

Further parameters for the internal function .reference.regression.
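
The warning about duplicates under species.sep can be illustrated with a short base R sketch. It does not call any vhica function: the sequence names and the helper species.label are hypothetical, and it assumes that the species label is the text before the first occurrence of the separator, which may differ from what read.vhica does internally.

```r
# Hypothetical sequence names (not taken from the package data).
seq.names <- c("dmel_1", "dmel_2", "dsim_1", "dana")

# Hypothetical helper: keep only the text before the first separator.
# Assumption: this mimics how species.sep truncates sequence names.
species.label <- function(x, sep = "_") {
  vapply(strsplit(x, sep, fixed = TRUE),
         function(parts) parts[1],
         character(1), USE.NAMES = FALSE)
}

labels <- species.label(seq.names)

# Labels that occur more than once after truncation.
dup <- unique(labels[duplicated(labels)])
```

Under these assumptions, "dmel_1" and "dmel_2" both collapse to the label "dmel", so a check of this kind before calling read.vhica may help spot ambiguous names in an alignment.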

Details

Details about CUB and divergence calculations can be found in CUB and div. If CUB and/or divergence need to be calculated by an external program, it is possible to provide them in the following format:

Value

The function returns an object of class vhica, a list containing:

Author(s)

Implementation: Arnaud Le Rouzic
Scientists who designed the method: Gabriel Wallau, Aurelie Hua-Van, Arnaud Le Rouzic.

References

Gabriel Luz Wallau, Arnaud Le Rouzic, Pierre Capy, Elgion Loreto, Aurelie Hua-Van. VHICA: A new method to discriminate between vertical and horizontal transposon transfer: application to the mariner family within Drosophila. Molecular Biology and Evolution 33 (4), 1094-1109.

See Also

plot.vhica, image.vhica, CUB, div

Examples

file.cb <- system.file("extdata", "mini-cbias.txt", package="vhica")
file.div <- system.file("extdata", "mini-div.txt", package="vhica")
file.tree <- if(require("ape")) system.file("extdata", "phylo.nwk", package="vhica") else NULL
vc <- read.vhica(cb.filename=file.cb, div.filename=file.div)
plot(vc, "dere", "dana")
image(vc, "mellifera:6", treefile=file.tree, skip.void=TRUE)

[Package vhica version 0.2.8 Index]