gl.run.epos {dartR.popgen}    R Documentation

Run EPOS for Inference of Historical Population-Size Changes

Description

This function runs EPOS (Lynch et al. 2019) to estimate historical changes in population size. It relies on compiled versions of the programs epos and epos2plot and, if bootstrap output is required, bootSfs. For more information on the approach, see the publication (Lynch et al. 2019), the GitHub repository https://github.com/EvolBioInf/epos and the manual epos.pdf (https://github.com/EvolBioInf/epos/blob/master/doc/epos.pdf). The binaries need to be provided in a single folder and can be downloaded via the gl.download.binary function (including the necessary DLLs for Windows; under Linux, the gsl and blas libraries need to be installed on your system). Please note: if you use this method, make sure you cite the original publication in your work. Below is a brief summary of the main concepts of EPOS:

EPOS (Estimation of Population Size changes) is a software tool, based on the theoretical framework of Lynch et al. (2019), that infers historical changes in population size from allele-frequency data obtained in population-genomic surveys. The method relies on the site-frequency spectrum (SFS) of nearly neutral polymorphisms. The underlying theory uses coalescence models, which describe how sampled gene sequences trace back to a common ancestor. By analyzing the probability distributions of the starting and ending points of branch segments over all possible coalescence trees, EPOS estimates historical population sizes.
The approach is model-flexible: it estimates historical effective population sizes with an efficient statistical procedure, without requiring the user to provide a candidate scenario.
For all available settings, please refer to the EPOS manual.
The main parameters needed to run the function are a genlight/dartR object, L (length of sequences), u (mutation rate), and the path to the epos binaries. For details, see the Examples section.
Please note: there is currently no good way to estimate L, the total length of all sequences. Users of DArT data often use the number of loci multiplied by 69, but this is an underestimate because monomorphic loci need to be included (and the length of the restriction site should be added for each locus). If u is not supplied, the EPOS default of 5e-9 is used, but the mutation rate should be adapted to the species of interest. The good news is that the values of L and u affect only the scaling of the axes of the inferred history, not its shape. Users can therefore infer the shape, but need to be careful with a temporal interpretation, as both the x and y axes are affected by the mutation rate and L.
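
The rule of thumb for L mentioned above can be written out as a small calculation. This is only an illustrative sketch: the number of loci is a made-up value, and the result is a lower bound rather than a proper estimate, because monomorphic loci and restriction-site lengths are not included.

```r
# Rough lower bound for L from the "number of loci x 69" rule of thumb.
# n_loci is a hypothetical value; for a genlight object x it could be
# taken from nLoc(x) (adegenet), which is an assumption of this sketch.
n_loci     <- 5000   # hypothetical number of loci
tag_length <- 69     # sequence length per locus quoted in the text above
L_rough    <- n_loci * tag_length
L_rough
```

A value obtained this way could be passed as the L argument, keeping in mind that it rescales the axes of the inferred history but not its shape.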

Usage

gl.run.epos(
  x,
  epos.path,
  sfs = NULL,
  minbinsize = 1,
  folded = TRUE,
  L = NULL,
  u = NULL,
  boot = 0,
  upper = 0.975,
  lower = 0.025,
  method = "greedy",
  depth = 2,
  other.options = "",
  cleanup = TRUE,
  plot.display = TRUE,
  plot.theme = theme_dartR(),
  plot.dir = NULL,
  plot.file = NULL,
  verbose = NULL
)

Arguments

x

dartR/genlight object

epos.path

path to the folder containing epos and the other required programs (epos and epos2plot are always required; bootSfs is required if a bootstrap and confidence estimate is requested)

sfs

if no sfs is provided, the function gl.sfs(x, minbinsize=1, singlepop=TRUE) is used to calculate the sfs that is passed to epos

minbinsize

remove bins from the left of the sfs. If you run epos from a genlight object, the sfs is calculated by the function (using gl.sfs) and, by default, minbinsize is set to 1 (the monomorphic loci of the sfs are removed). This parameter is ignored if the sfs is provided via the sfs parameter (see above). Be aware that even if your genlight object has more than one population, the sfs is calculated with singlepop set to TRUE (one sfs for all individuals), as epos does not work with multidimensional sfs.

folded

if set to TRUE (default), a folded sfs (minor allele frequency sfs) is returned. If set to FALSE, an unfolded sfs (derived allele frequency sfs) is returned. It is assumed that 0 is homozygous for the reference allele and 2 is homozygous for the derived allele, so you need to make sure your coding is correct. Option -U in epos.

L

length of sequences (including monomorphic and polymorphic sites). If the sfs is provided with minbinsize=1 (default), then L needs to be specified. Option -l in epos.

u

mutation rate. If not provided, the default value of epos is used (5e-9). Option -u in epos.

boot

if set to a value > 0, the program bootSfs is used to generate multiple bootstrapped sfs, which allows confidence intervals of the historical Ne to be calculated. Be aware that this can extend the runtime. Default: 0 (no bootstrap replicates are run); otherwise boot bootstrap replicates are run (option -i in bootSfs).

upper

upper quantile of the bootstrap (only used if boot>0). default 0.975. (option -u in epos2plot)

lower

lower quantile of the bootstrap (only used if boot>0). default 0.025. (option -l in epos2plot)

method

either "exhaustive" or "greedy". Check the epos manual for details. If method="exhaustive", the parameter depth is used. Default: "greedy".

depth

if method="exhaustive", this parameter sets the search depth [default 2]. If method is set to "greedy", this setting is ignored.

other.options

additional options for epos (e.g. -m, -x etc.)

cleanup

if set to TRUE, intermediate temporary files are deleted after the run

plot.display

Specify if plot is to be produced [default TRUE].

plot.theme

User specified theme [default theme_dartR()].

plot.dir

Directory to save the plot RDS files [default as specified by the global working directory or tempdir()]

plot.file

Filename (minus extension) for the RDS plot file [Required for plot save]

verbose

Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, unless specified using gl.set.verbosity].
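
To make the difference between the folded and unfolded sfs (arguments folded and minbinsize) more concrete, here is a minimal base-R sketch on a toy genotype matrix. It is not the gl.sfs implementation; the matrix and all object names are invented for illustration, and the data are assumed to be diploid with no missing values.

```r
# Toy genotypes: rows = individuals, columns = loci, coded
# 0 (homozygous reference), 1 (heterozygous), 2 (homozygous derived).
geno <- matrix(c(0, 1, 2, 0,
                 0, 1, 2, 1,
                 0, 0, 2, 1),
               nrow = 3, byrow = TRUE)

n_chrom <- 2 * nrow(geno)   # number of sampled chromosomes (diploid)
derived <- colSums(geno)    # derived-allele count per locus

# Unfolded sfs: number of loci with 0, 1, ..., n_chrom derived copies
unfolded <- tabulate(derived + 1, nbins = n_chrom + 1)

# Folded sfs: count the minor allele, so classes i and n_chrom - i merge
minor  <- pmin(derived, n_chrom - derived)
folded <- tabulate(minor + 1, nbins = n_chrom %/% 2 + 1)

# Dropping the first bin removes the monomorphic class, which is what
# the description of minbinsize = 1 above refers to
folded_no_mono <- folded[-1]
```

Folding needs no knowledge of which allele is derived, which is why the 0/2 coding only matters when folded = FALSE.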

Value

returns a list with two components:

Author(s)

Custodian: Bernd Gruber – Post to https://groups.google.com/d/forum/dartr

References

Lynch, Michael, Bernhard Haubold, Peter Pfaffelhuber, and Takahiro Maruki. 2019. Inference of Historical Population-Size Changes with Allele-Frequency Data. G3: Genes|Genomes|Genetics 10, no. 1: 211–23. doi:10.1534/g3.119.400854.

Examples

## Not run: 
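# Download the EPOS binaries first if they are not yet available:
#gl.download.binary("epos",os="windows")
require(dartR.data)
# Run EPOS on the possums data set; L and u are example values only
epos <- gl.run.epos(possums.gl, epos.path = file.path(tempdir(),"epos"), L=1e5, u = 1e-8)
# Inspect the inferred population-size history
epos$history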

## End(Not run)
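
Because the function depends on external binaries in epos.path, it can help to check that they are present before starting a run. The helper below is a hypothetical sketch and not part of dartR.popgen; it assumes the program files are named epos, epos2plot and bootSfs, with an ".exe" suffix on Windows.

```r
# Hypothetical helper (not part of dartR.popgen): report which of the
# required EPOS programs are found in a folder.
check_epos_binaries <- function(epos.path, boot = 0) {
  progs <- c("epos", "epos2plot")
  # bootSfs is only needed when bootstrapping is requested
  if (boot > 0) progs <- c(progs, "bootSfs")
  # Assumption: Windows binaries carry an ".exe" suffix
  if (.Platform$OS.type == "windows") progs <- paste0(progs, ".exe")
  found <- file.exists(file.path(epos.path, progs))
  names(found) <- progs
  found
}

check_epos_binaries(file.path(tempdir(), "epos"), boot = 10)
```

If any entry is FALSE, the corresponding program is missing from the folder and the binaries would need to be downloaded or copied there before calling gl.run.epos.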



[Package dartR.popgen version 1.0.0 Index]