## Difference score

### Description

Measures the difference score between two aligned amino acid or nucleotide sequences.

### Usage

```
dif(seq1, seq2, gap = FALSE, aa.strict = FALSE)
```

### Arguments

| Argument | Description |
|---|---|
| `seq1` | A character vector representing a first sequence. |
| `seq2` | A character vector representing a second sequence. |
| `gap` | A boolean indicating whether the gap character should be taken as a supplementary symbol (`TRUE`) or not (`FALSE`). Default is `FALSE`. |
| `aa.strict` | A boolean indicating whether only strict amino acids should be taken into account (`TRUE`) or not (`FALSE`). Default is `FALSE`. |

### Details

The difference score between two aligned sequences is the proportion of sites that differ. It is equivalent to `1 - PID`, where PID is the percent identity expressed as a proportion. `dif` is computed as the number of aligned positions (sites) whose symbols differ, divided by the number of aligned positions, and is equivalent to the *p* distance defined by Nei and Zhang (2006).

In `dif`, positions with a gap in at least one sequence are excluded when `gap = FALSE`. When gaps are taken as a supplementary symbol (`gap = TRUE`), only sites with gaps in both sequences are excluded.

From Nei and Zhang (2006), the *p* distance, which is the proportion of sites that differ between two sequences, is estimated by:

$$p = \frac{n_d}{n},$$

where *n* is the number of sites and *n_d* is the number of sites with different symbols.

The difference score ranges from 0, for identical sequences, to 1, for completely different sequences.
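The rules above can be sketched in base R. This is only an illustration of the formula and of the two gap modes, not the code used by `dif`: the helper name `p_distance_sketch`, its `gap_char` parameter, and the choice of `"-"` as the gap character are assumptions made for this example, each sequence is assumed to hold one symbol per vector element, and `aa.strict` is not modelled.

```r
# Hypothetical helper, not part of bios2mds: a base-R sketch of p = n_d / n.
# Assumes one symbol per vector element and "-" as the gap character.
p_distance_sketch <- function(seq1, seq2, gap = FALSE, gap_char = "-") {
  stopifnot(length(seq1) == length(seq2))  # sequences must be aligned
  is_gap1 <- seq1 == gap_char
  is_gap2 <- seq2 == gap_char
  if (gap) {
    # Gap is treated as a symbol; drop only sites gapped in both sequences.
    keep <- !(is_gap1 & is_gap2)
  } else {
    # Drop every site with a gap in at least one sequence.
    keep <- !(is_gap1 | is_gap2)
  }
  n <- sum(keep)                          # number of sites compared
  if (n == 0) return(NA_real_)            # nothing left to compare
  n_d <- sum(seq1[keep] != seq2[keep])    # sites with different symbols
  n_d / n
}

s1 <- c("A", "C", "-", "G", "T", "-")
s2 <- c("A", "T", "C", "G", "-", "-")
p_distance_sketch(s1, s2)              # 3 gap-free sites, 1 differs: 1/3
p_distance_sketch(s1, s2, gap = TRUE)  # 5 sites kept, 3 differ: 0.6
```

In this toy alignment, switching to `gap = TRUE` raises the score because the two sites where only one sequence has a gap are then counted as differences.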

### Value

A single numeric value representing the difference score.

### Author(s)

Julien Pele

### Examples

```
# Compute the difference score between the sequences
# of CLTR1_HUMAN and CLTR2_HUMAN:
library(bios2mds)
aln <- import.fasta(system.file("msa/human_gpcr.fa", package = "bios2mds"))
# Store the result under a name that does not mask the dif() function
score <- dif(aln$CLTR1_HUMAN, aln$CLTR2_HUMAN)
score
```

