get_pos_based_seq_weights {BALCONY}    R Documentation

Get position based weights of sequences in alignment

Description

This function calculates position-based sequence weights for a given multiple sequence alignment (MSA), following Henikoff & Henikoff (1994). The weight of a sequence is the sum of its scores over all alignment columns. At column c, a sequence scores 1/(r*s), where r is the number of different residues in column c and s is the number of sequences that carry the same residue as this sequence at that column.
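
The scoring rule can be written compactly as follows. The notation (w_i, L, r_c, s_{c,i}) is introduced here for illustration only and is not taken from the package source: w_i is the weight of sequence i, L is the number of alignment columns, r_c is the number of different symbols in column c, and s_{c,i} is the number of sequences sharing the symbol of sequence i at column c.

```latex
w_i = \sum_{c=1}^{L} \frac{1}{r_c \, s_{c,i}},
\qquad
w_i^{\text{normalized}} = \frac{w_i}{L}
```

Within one column, each of the r_c symbols contributes s * 1/(r_c * s) = 1/r_c in total, so the scores of a column add up to 1 when every symbol is counted. This is why dividing by L makes the weights sum to 1.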

Usage

get_pos_based_seq_weights(alignment, gap=TRUE, normalized=TRUE)

Arguments

alignment

an alignment loaded with read.alignment

gap

(optional) a logical parameter; if TRUE (default), gaps in the MSA are included in the calculation

normalized

(optional) a logical parameter; if TRUE (default), the weights of all sequences are divided by the number of columns in the alignment (when gap = TRUE, the weights sum to 1)

Details

The weights can be calculated for amino acid symbols only or for all symbols (including gaps). The weights can also be normalized by the number of columns in the MSA, in which case the weights of all sequences sum to 1.
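
The calculation described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: henikoff_weights_sketch is a hypothetical helper, it takes a plain character matrix (rows are sequences, columns are alignment positions) rather than an object returned by read.alignment, and its handling of gap = FALSE (gap symbols receive no score and are not counted as a distinct symbol) is an assumption.

```r
# Hypothetical helper, NOT part of BALCONY: position-based weights for a
# character matrix (rows = sequences, columns = alignment positions).
henikoff_weights_sketch <- function(aln, gap = TRUE, normalized = TRUE) {
  weights <- numeric(nrow(aln))
  for (j in seq_len(ncol(aln))) {
    column <- aln[, j]
    # Assumption: with gap = FALSE, "-" is neither scored nor counted
    # as a distinct symbol in the column.
    scored <- if (gap) rep(TRUE, length(column)) else column != "-"
    if (!any(scored)) next
    counts <- table(column[scored])          # occurrences of each symbol
    r <- length(counts)                      # number of distinct symbols
    s <- as.numeric(counts[column[scored]])  # occurrences of each sequence's symbol
    weights[scored] <- weights[scored] + 1 / (r * s)
  }
  # Optional normalization by the number of alignment columns
  if (normalized) weights <- weights / ncol(aln)
  weights
}

# Toy alignment: three sequences, three columns
aln <- rbind(
  seq1 = c("A", "C", "D"),
  seq2 = c("A", "C", "E"),
  seq3 = c("A", "G", "-")
)
w <- henikoff_weights_sketch(aln)
```

In this toy alignment, seq3 carries the only G in the second column, so it receives the largest weight (14/36, against 11/36 for each of the other two sequences), and the three normalized weights sum to 1.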

Value

weights

a vector of position-based weights, one for each sequence in the given alignment

Author(s)

Alicja Pluciennik & Michal Stolarczyk

References

Henikoff, S. & Henikoff, J. G. Position-based sequence weights. Journal of Molecular Biology 243, 574–578 (1994).

Examples

data("small_alignment")
pos_based_weights <- get_pos_based_seq_weights(small_alignment)

[Package BALCONY version 0.2.10 Index]