EstDynamics {TE}    R Documentation

Estimate TE dynamics using mismatch data

Description

Given the number of mismatches and element lengths for an LTR retrotransposon family, estimate the age distribution, insertion rate, and deletion rates.

Usage

EstDynamics(mismatch, len, r = 0.013, perturb = 2, rateRange = NULL,
  plotFit = FALSE, plotSensitivity = FALSE, pause = plotFit &&
  plotSensitivity, main = sprintf("n = %d", n))

EstDynamics2(mismatch, len, r = 0.013, nTrial = 10L, perturb = 2,
  rateRange = NULL, plotFit = FALSE, plotSensitivity = FALSE,
  pause = plotFit && plotSensitivity, ...)

Arguments

mismatch

A vector containing the number of mismatches.

len

A vector containing the length of each element.

r

Mutation rate, in substitutions per site per million years, used in the calculation.

perturb

A scalar multiplier used to perturb the death rate away from its estimate under the null hypothesis. Used to generate the sensitivity analysis.

rateRange

A vector of death rates. An alternative to perturb for specifying the death rates.

plotFit

Whether to plot the distribution fits.

plotSensitivity

Whether to plot the sensitivity analysis.

pause

Whether to pause after each plot.

main

The title for the plot.

nTrial

The number of starting points for searching for the MLE.

...

Further arguments passed to EstDynamics.
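The role of r can be illustrated with a rough age conversion. The sketch below assumes the common convention for dating LTR retrotransposons: mismatches are counted between the two LTRs of one element, both LTRs accumulate substitutions independently, and the expected divergence at age t is therefore 2 * r * t. Whether EstDynamics uses exactly this conversion is not stated on this page (see References), and ltrAge is a hypothetical helper, not part of the package.

```r
# Hypothetical helper, not part of the TE package.
# Assumption: divergence between the two LTRs grows at rate 2 * r
# per site per million years.
ltrAge <- function(mismatch, len, r = 0.013) {
  stopifnot(length(mismatch) == length(len), all(len > 0), r > 0)
  divergence <- mismatch / len   # proportion of mismatched sites
  divergence / (2 * r)           # approximate age in million years
}

# Three made-up elements
ages <- ltrAge(mismatch = c(0, 13, 26), len = c(1000, 1000, 500))
ages
```

This is a point estimate per element only. It ignores the sampling noise in the mismatch counts, which the distribution fits described under Details are meant to handle.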

Details

EstDynamics estimates the TE dynamics by fitting a negative binomial distribution to the mismatch data, while EstDynamics2 fits a mixture model. For implementation details, see References.
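As an illustration of what fitting a negative binomial to count data involves, the following is a minimal base-R sketch that maximizes the likelihood with optim. It is not the package's implementation: it uses simulated counts, ignores element length and the mutation rate, and the names negLogLik and est are chosen here for illustration only.

```r
set.seed(1)
# Simulated mismatch counts (made-up data, not from AetLTR)
mismatch <- rnbinom(500, size = 2, mu = 10)

# Negative log-likelihood, parameterized on the log scale so that
# both size and mu stay positive during optimization
negLogLik <- function(par, x) {
  -sum(dnbinom(x, size = exp(par[1]), mu = exp(par[2]), log = TRUE))
}

# Nelder-Mead search (the optim default) from a moment-based start
fit <- optim(c(0, log(mean(mismatch))), negLogLik, x = mismatch)
est <- c(size = exp(fit$par[1]), mu = exp(fit$par[2]))
est
```

A single starting point is usually adequate for a two-parameter fit like this one. Mixture likelihoods tend to have several local optima, which is presumably why EstDynamics2 exposes nTrial.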

Value

EstDynamics returns a TEfit object, containing the following fields, where the unit for time is million years ago (Mya):

pvalue

The p-value for testing H_0: The insertion rate is uniform over time.

ageDist

A list containing the estimated age distributions.

insRt

A list containing the estimated insertion rates.

agePeakLoc

The maximum point (in age) of the age distribution.

insPeakLoc

The maximum point (in time) of the insertion rate.

estimates

The parameter estimates from fitting the distributions; see References.

sensitivity

A list containing the results for the sensitivity analysis, with fields time: time points; delRateRange: A vector for the range of deletion rates; insRange: A matrix whose columns contain the insertion rates under different scenarios.

n

The sample size.

meanLen

The mean element length.

meanDiv

The mean divergence.

KDE

A list containing the kernel density estimate for the mismatch data.

logLik

The log-likelihoods of the parametric fits.

EstDynamics2 returns a TEfit2 object, containing all of the above TEfit fields and the following:

estimates2

The parameter estimates from fitting the mixture distribution.

ageDist2

The estimated age distribution from fitting the mixture distribution.

insRt2

The estimated insertion rate from fitting the mixture distribution.

agePeakLoc2

Maximum point(s) for the age distribution.

insPeakLoc2

Maximum point(s) for the insertion rate.
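The sensitivity field can be summarized by locating the insertion-rate peak under each deletion-rate scenario. The sketch below assumes that the rows of insRange correspond to the entries of time and that its columns are in the same order as delRateRange. Neither is confirmed on this page, so check the dimensions on a real fit first. peakByScenario is a hypothetical helper, and the mock list merely stands in for res$sensitivity.

```r
# Hypothetical helper, not part of the TE package.
# Assumption: rows of insRange align with time, columns with delRateRange.
peakByScenario <- function(sens) {
  stopifnot(ncol(sens$insRange) == length(sens$delRateRange),
            nrow(sens$insRange) == length(sens$time))
  data.frame(
    delRate  = sens$delRateRange,
    peakTime = sens$time[apply(sens$insRange, 2, which.max)]
  )
}

# Mock object with the documented field names and made-up values
time <- seq(0, 4, by = 0.5)
mock <- list(
  time         = time,
  delRateRange = c(0.5, 1, 2),
  insRange     = cbind(dnorm(time, 1), dnorm(time, 1.5), dnorm(time, 2))
)
peaks <- peakByScenario(mock)
peaks
```

With a real fit, the call would be peakByScenario(res1$sensitivity), provided the layout assumptions above hold.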

References

Dai, X., Wang, H., Dvorak, J., Bennetzen, J., Mueller, H.-G. (2018). "Birth and Death of LTR Retrotransposons in Aegilops tauschii". Genetics.

Examples

# Analyze Gypsy family 24 (Nusif)
data(AetLTR)
dat <- subset(AetLTR, GroupID == 24 & !is.na(Chr))
set.seed(1)
res1 <- EstDynamics(dat$Mismatch, dat$UngapedLen, plotFit=TRUE, plotSensitivity=FALSE, pause=FALSE)

# p-value for testing a uniform insertion rate
res1$pvalue


# Use a mixture distribution to improve fit
res2 <- EstDynamics2(dat$Mismatch, dat$UngapedLen, plotFit=TRUE)

# A larger number of trials is recommended to achieve the global MLE
## Not run: 
res3 <- EstDynamics2(dat$Mismatch, dat$UngapedLen, plotFit=TRUE, nTrial=1000L)

## End(Not run)

[Package TE version 0.3-0 Index]