R: Difference score

dif {bios2mds}

R Documentation

Difference score

Description

Measures the difference score between two aligned amino acid or nucleotide sequences.

Usage

dif(seq1, seq2, gap = FALSE, aa.strict = FALSE)

Arguments

`seq1`	a character vector representing a first sequence.
`seq2`	a character vector representing a second sequence.
`gap`	a boolean indicating whether the gap character should be taken as a supplementary symbol (TRUE) or not (FALSE). Default is FALSE.
`aa.strict`	a boolean indicating whether only strict amino acids should be taken into account (TRUE) or not (FALSE). Default is FALSE.

Details

The difference score between two aligned sequences is the proportion of sites that differ, and is equivalent to 1 - PID, where PID is the percent identity expressed as a fraction. It is computed as the number of aligned positions (sites) whose symbols differ, divided by the number of aligned positions. dif is equivalent to the p distance defined by Nei and Zhang (2006). By default (gap = FALSE), positions with a gap in at least one sequence are excluded. When gaps are taken as a supplementary symbol (gap = TRUE), only sites with gaps in both sequences are excluded.

From Nei and Zhang (2006), the p distance, which is the proportion of sites that differ between two sequences, is estimated by:

p = \frac{n_d}{n},

where n is the number of sites and n_d is the number of sites with different symbols.

The difference score ranges from 0, for identical sequences, to 1, for completely different sequences.
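
The calculation described above can be sketched in base R. This is only an illustration of the formula, not the bios2mds implementation: the helper name p_distance is hypothetical, "-" is assumed to be the gap character, and the aa.strict filtering is not reproduced.

```r
# Hypothetical helper illustrating p = n_d / n; not the package's own code.
# Assumes sequences are character vectors of equal length and "-" marks a gap.
p_distance <- function(seq1, seq2, gap = FALSE) {
  stopifnot(length(seq1) == length(seq2))
  is_gap1 <- seq1 == "-"
  is_gap2 <- seq2 == "-"
  if (gap) {
    # gap treated as an extra symbol: drop only sites gapped in both sequences
    keep <- !(is_gap1 & is_gap2)
  } else {
    # drop every site with a gap in at least one sequence
    keep <- !(is_gap1 | is_gap2)
  }
  n <- sum(keep)                          # number of sites compared
  if (n == 0) return(NA_real_)            # nothing left to compare
  n_d <- sum(seq1[keep] != seq2[keep])    # sites with different symbols
  n_d / n
}

s1 <- c("A", "C", "-", "G", "T", "-")
s2 <- c("A", "G", "T", "G", "-", "-")
p_distance(s1, s2)              # 3 sites kept, 1 differs
p_distance(s1, s2, gap = TRUE)  # 5 sites kept, 3 differ
```

With gap = FALSE only the three ungapped sites are compared, so a single mismatch gives 1/3. With gap = TRUE the two sites gapped in one sequence count as differences, giving 3/5.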

Value

A single numeric value representing the difference score.

Author(s)

Julien Pele

References

May AC (2004) Percent sequence identity: the need to be explicit. Structure 12:737-738.

Nei M and Zhang J (2006) Evolutionary Distance: Estimation. Encyclopedia of Life Sciences.

Nei M and Kumar S (2000) Molecular Evolution and Phylogenetics. Oxford University Press, New York.

Examples

# calculating the difference score between the sequences 
# of CLTR1_HUMAN and CLTR2_HUMAN:
aln <- import.fasta(system.file("msa/human_gpcr.fa", package = "bios2mds"))
# store the result under a name that does not mask the dif() function
score <- dif(aln$CLTR1_HUMAN, aln$CLTR2_HUMAN)
score

[Package bios2mds version 1.2.3 Index]