AhRs {partitionMetric}    R Documentation

Sample data for partitionMetric

Description

This small dataset contains aligned protein sequences for seven alleles of the aryl hydrocarbon receptor (AhR).

Usage

data(AhRs)

Format

The format is a character matrix in which column i represents the i-th position in the alignment and contains an amino acid code, or "-" to indicate an indel. Row names contain the animal species.

Details

A DNA or protein sequence has an associated index set {1, 2, ..., n} that labels the n positions of its nucleotides or amino acids (AA). This index set can be partitioned so that all positions holding the same nucleotide or AA belong to the same subset. For example, given the sequence ATGTA and its index set {1, 2, ..., 5}, the "A" partition contains the subset {1, 5}, the "T" partition contains {2, 4}, and the "G" partition contains {3}.
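The ATGTA example can be reproduced with a few lines of base R. This is only an illustrative sketch: the variable names `seq_chars` and `partition` are chosen here for illustration and are not part of the package.

```r
# Split the example sequence into single characters.
seq_chars <- strsplit("ATGTA", "")[[1]]

# Group the positions 1..n by the character found at each position.
# The result is a named list with one element per distinct character.
partition <- split(seq_along(seq_chars), seq_chars)

partition$A  # positions holding "A": 1 5
partition$T  # positions holding "T": 2 4
partition$G  # positions holding "G": 3
```

The same idea applies to one row of the AhRs matrix, since each row is already a character vector with one element per alignment position.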

Given two aligned sequences and their respective partitions of the index set, a metric distance between these partitions can be computed. See partitionMetric for such a metric, along with an example of clustering this AhR dataset.
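To make the idea of a distance between partitions concrete, the sketch below implements one well-known choice in base R, a pair-counting (Mirkin-style) distance: it counts the ordered pairs of positions that share a subset in one partition but not in the other. This is an assumption made for illustration and is not necessarily the metric that partitionMetric computes. The function name `pair_count_distance` and the second sequence ATGAA are hypothetical.

```r
# Pair-counting distance between two partitions of the same index set.
# x and y are vectors of equal length; positions with equal values in a
# vector belong to the same subset of that vector's partition.
pair_count_distance <- function(x, y) {
  stopifnot(length(x) == length(y))
  joint <- table(x, y)  # sizes of all pairwise intersections of subsets
  # Row sums are the subset sizes for x, column sums those for y.
  sum(rowSums(joint)^2) + sum(colSums(joint)^2) - 2 * sum(joint^2)
}

s1 <- strsplit("ATGTA", "")[[1]]
s2 <- strsplit("ATGAA", "")[[1]]  # hypothetical second aligned sequence

pair_count_distance(s1, s1)  # 0: identical partitions
pair_count_distance(s1, s2)  # 6
```

Applying such a function to every pair of rows of a sequence matrix would give a distance matrix suitable for hierarchical clustering with `hclust(as.dist(...))`.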

Source

This dataset was derived from NCBI HomoloGene:1224.

References

Mark Hahn, Aryl hydrocarbon receptors: diversity and evolution. Chem Biol Interact, 2002, 141, 131-160


[Package partitionMetric version 1.1 Index]