feat_mutation {coala}    R Documentation

Feature: Mutation

Description

This feature adds mutations to a model. Mutations occur in the genomes of the individuals with a given rate. The rate is per locus for unlinked loci and per trio for linked locus trios. By default, the same mutation rate is used for all loci, but it is possible to change this with par_variation and par_zero_inflation.
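
As a minimal sketch of a rate that differs between loci, the snippet below wraps the rate in par_variation. It assumes coala and a compatible simulator are installed. The mean of 5 and the variance of 10 are arbitrary illustration values, and the help page of par_variation remains the authoritative description of how the per-locus values are drawn.

```r
library(coala)

# 5 haplotypes, 10 loci; the mutation rate has mean 5 and varies between
# loci with variance 10 (both values are illustrative assumptions)
model <- coal_model(5, 10) +
  feat_mutation(par_variation(par_const(5), 10)) +
  sumstat_seg_sites()

stats <- simulate(model)
length(stats$seg_sites)  # one entry per locus
```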

Usage

feat_mutation(
  rate,
  model = "IFS",
  base_frequencies = NA,
  tstv_ratio = NA,
  gtr_rates = NA,
  fixed_number = FALSE,
  locus_group = "all"
)

Arguments

rate

The mutation rate. Can be a numeric or a parameter. The rate is specified as 4 * N0 * mu, where mu is the mutation rate per locus.

model

The mutation model you want to use. Can be either 'IFS' (default), 'HKY' or 'GTR'. Refer to the mutation model section for detailed information.

base_frequencies

The equilibrium frequencies of the four bases used in the 'HKY' mutation model. Must be a numeric vector of length four, with the values for A, C, G and T, in that order.

tstv_ratio

The ratio of transitions to transversions used in the 'HKY' mutation model.

gtr_rates

The rates for the six nucleotide substitution types used in the 'GTR' model. Must be a numeric vector of length six. Order: A<->C, A<->G, A<->T, C<->G, C<->T, G<->T.

fixed_number

If set to TRUE, each locus receives exactly as many mutations as given by rate, instead of mutations arising at that rate along the ancestral tree.

locus_group

The loci for which this feature is used. Can either be "all" (default), in which case the feature is used for simulating all loci, or a numeric vector. In the latter case, the feature is only used for the loci added by the locus_ commands with the corresponding indices, counted from 1 in the order in which the commands were added to the model. For example, if a model has locus_single(10) + locus_averaged(10, 11) + locus_single(12) and this argument is c(2, 3), then the feature is used for all but the first locus (that is, loci 2 to 12).
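
The locus_group example above can be checked with a few lines of base R. This is only an illustration of the index arithmetic, not coala's internal implementation; the variable names are made up for this sketch.

```r
# Number of loci contributed by each locus_ command, in the order added:
# locus_single(10) -> 1, locus_averaged(10, 11) -> 10, locus_single(12) -> 1
loci_per_group <- c(1, 10, 1)

# Group index of every locus in the model
group_of_locus <- rep(seq_along(loci_per_group), times = loci_per_group)

# Loci covered by locus_group = c(2, 3)
selected <- which(group_of_locus %in% c(2, 3))
selected
```

The result is the loci 2 to 12, that is, every locus except the first.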

Value

The feature, which can be added to a model created with coal_model using +.

Mutation Models

The infinite sites mutation (IFS) model is a frequently used simplification in population genetics. It assumes that each locus consists of infinitely many sites at which mutations can occur, and that each mutation hits a new site. Consequently, there are no back-mutations in this model. It does not generate DNA sequences, but only 0/1 coded data, where 0 denotes the ancestral state of a site and 1 the derived state created by a mutation.
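
To make the 0/1 coding concrete, here is a small hand-written example in base R. The matrix is hypothetical data, not simulator output, and it assumes the common layout of one row per haplotype and one column per mutated site.

```r
# Hypothetical 0/1 coded locus: 4 haplotypes (rows), 3 mutated sites (columns)
snps <- matrix(c(0, 1, 0,
                 1, 1, 0,
                 0, 0, 1,
                 0, 1, 0),
               nrow = 4, byrow = TRUE)

# Number of haplotypes carrying the derived state (1) at each site
derived_counts <- colSums(snps)

# Under IFS every mutation creates a new site, so the number of columns
# equals the number of mutations on this locus
n_mutations <- ncol(snps)
```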

The other mutation models are finite site models that generate more realistic sequences.

The Hasegawa, Kishino and Yano (HKY) model (Hasegawa et al., 1985) allows for different rates of transitions and transversions (tstv_ratio) and unequal frequencies of the four nucleotides (base_frequencies).

The general time reversible (GTR) model (e.g. Yang, 1994) is more general than the HKY model and lets you define the rate for each type of substitution. The rates are assumed to be symmetric (e.g., the rate for T to G is equal to the one for G to T).
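
The symmetry assumption and the ordering of gtr_rates can be sketched in base R by expanding the six values into a 4 x 4 table. The rate values are arbitrary, and the code only illustrates the ordering given for gtr_rates; it is not how the simulator stores the rates.

```r
bases <- c("A", "C", "G", "T")

# Hypothetical rates in the order A<->C, A<->G, A<->T, C<->G, C<->T, G<->T;
# the two transitions (A<->G, C<->T) get twice the rate of the others
gtr_rates <- c(1, 2, 1, 1, 2, 1)

# All unordered base pairs in the same order: (1,2), (1,3), (1,4), (2,3), ...
pairs <- t(combn(4, 2))

rate_matrix <- matrix(0, 4, 4, dimnames = list(bases, bases))
rate_matrix[pairs] <- gtr_rates            # upper triangle, e.g. A -> G
rate_matrix[pairs[, 2:1]] <- gtr_rates     # mirrored entries, e.g. G -> A

rate_matrix
```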

See Also

For using rates that vary between the loci in a model: par_variation, par_zero_inflation

For adding recombination: feat_recombination.

For creating a model: coal_model

Other features: feat_growth(), feat_ignore_singletons(), feat_migration(), feat_outgroup(), feat_pop_merge(), feat_recombination(), feat_selection(), feat_size_change(), feat_unphased()

Examples

# A model with a constant mutation rate of 5:
model <- coal_model(5, 1) + feat_mutation(5) + sumstat_seg_sites()
simulate(model)

# A model with a mutation rate of 5.0 for the first 10 loci, and 7.5 for the
# second 10 loci
model <- coal_model(4) +
  locus_averaged(10, 100) +
  locus_averaged(10, 100) +
  feat_mutation(5.0, locus_group = 1) +
  feat_mutation(7.5, locus_group = 2) +
  sumstat_seg_sites()
simulate(model)

# A model with 7 mutations per locus:
model <- coal_model(5, 1) +
  feat_mutation(7, fixed_number = TRUE) +
  sumstat_seg_sites()
simulate(model)

# A model using the HKY model:
model <- coal_model(c(10, 1), 2) +
  feat_mutation(7.5, model = "HKY", tstv_ratio = 2,
                base_frequencies = c(.25, .25, .25, .25)) +
  feat_outgroup(2) +
  feat_pop_merge(1.0, 2, 1) +
  sumstat_seg_sites()
## Not run: simulate(model)

# A model using the GTR model:
model <- coal_model(c(10, 1), 1, 25) +
  feat_mutation(7.5, model = "GTR",
                gtr_rates = c(1, 1, 1, 1, 1, 1) / 6) +
  feat_outgroup(2) +
  feat_pop_merge(1.0, 2, 1) +
  sumstat_dna()
## Not run: simulate(model)$dna

[Package coala version 0.7.2 Index]