AhRs {partitionMetric}    R Documentation
Sample data for partitionMetric
Description
This small dataset contains aligned protein sequences for seven alleles of the aryl hydrocarbon receptor (AhR).
Usage
data(AhRs)
Format
The format is a character matrix in which column i represents
the i'th position in the alignment and contains an amino
acid code or "-" indicating an indel. Row names contain the
animal species.
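As an illustration of this layout, the following is a minimal sketch that builds a toy matrix of the same shape. The row names `speciesA` and `speciesB` and the residues are made up for illustration and are not taken from the AhRs data.

```r
# Toy alignment (hypothetical values, NOT the real AhRs data):
# one row per species, one column per alignment position.
toy <- rbind(
  speciesA = strsplit("MNS-S", "")[[1]],
  speciesB = strsplit("MSSGA", "")[[1]]
)

dim(toy)             # 2 rows (species) by 5 columns (positions)
rownames(toy)        # species labels
toy["speciesA", 4]   # "-" marks an indel at position 4

# Locate every indel as (row, column) pairs.
which(toy == "-", arr.ind = TRUE)
```

After `data(AhRs)`, the same indexing should apply to the real matrix.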
Details
A DNA or protein sequence has an associated index set that labels the
positions of the nucleotides or amino acids (AA).
This index set can be partitioned such that all positions holding
the same nucleotide or AA are grouped into one homogeneous subset.
For example, given the sequence
ATGTA
and its index set {1, 2, 3, 4, 5}, the "A" partition
contains the subset {1, 5}, the "T" partition contains
{2, 4}, and so on.
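The ATGTA example can be reproduced in base R. This is a minimal sketch; the variable names `s` and `blocks` are illustrative only.

```r
# Split the example sequence into single characters.
s <- strsplit("ATGTA", "")[[1]]

# Group the index set 1..5 by the symbol found at each position.
blocks <- split(seq_along(s), s)

blocks[["A"]]  # 1 5
blocks[["T"]]  # 2 4
blocks[["G"]]  # 3
```

Every index appears in exactly one subset, so `blocks` is a partition of the index set.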
Given two aligned sequences and their respective partitions of the
index set, a metric distance between these partitions can be computed.
See partitionMetric for such a metric, along with an example
of clustering this AhR dataset.
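To make the idea of a distance between two partitions concrete, here is a rough base-R sketch of a simple pair-counting distance. It is a stand-in chosen for illustration and is not the metric implemented by partitionMetric. The helper names `comembership` and `pair_distance` and the second sequence `ATGAA` are hypothetical.

```r
# Co-membership matrix: TRUE where positions i and j carry the same symbol.
comembership <- function(s) outer(s, s, "==")

# Count the unordered index pairs that are grouped together in exactly one
# of the two partitions (a Hamming distance on co-membership indicators).
# NOTE: illustrative stand-in, not the partitionMetric definition.
pair_distance <- function(s1, s2) {
  stopifnot(length(s1) == length(s2))
  differs <- comembership(s1) != comembership(s2)
  sum(differs[upper.tri(differs)])
}

x <- strsplit("ATGTA", "")[[1]]
y <- strsplit("ATGAA", "")[[1]]  # hypothetical second aligned sequence

pair_distance(x, x)  # 0: identical partitions
pair_distance(x, y)  # 3: pairs (1,4), (2,4) and (4,5) differ
```

The distance depends only on which positions are grouped together, not on the symbols themselves, so relabeling the residues of one sequence leaves it unchanged.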
Source
This dataset was derived from NCBI HomoloGene:1224.
References
Mark Hahn. Aryl hydrocarbon receptors: diversity and evolution. Chem Biol Interact, 2002, 141, 131-160.