gl.run.stairway2 {dartR.popgen}    R Documentation

Run Stairway Plot 2 for Demographic History Inference

Description

This function runs Stairway Plot 2, a method for inferring demographic history from folded SNP frequency spectra.
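As a point of orientation, a folded SFS records, for each minor-allele count, how many SNPs show that count. The sketch below builds one in base R from a small made-up genotype matrix (individuals in rows, loci in columns, genotypes coded 0/1/2, no missing data). The matrix and variable names are hypothetical, and this is not necessarily how the package derives the spectrum from a genlight/dartR object.

```r
# Toy genotype matrix: 4 diploid individuals (rows) x 3 loci (columns),
# coded as the number of alternate alleles (0, 1 or 2). Made-up data.
geno <- matrix(c(0, 1, 2,
                 1, 1, 0,
                 0, 2, 2,
                 0, 0, 1), nrow = 4, byrow = TRUE)

n_ind <- nrow(geno)
n_chrom <- 2 * n_ind                      # diploid: two allele copies each

alt_count <- colSums(geno)                # alternate-allele count per locus
minor_count <- pmin(alt_count, n_chrom - alt_count)  # fold the spectrum

# Drop monomorphic loci (minor count 0) and count loci per bin 1..n_ind
minor_count <- minor_count[minor_count > 0]
folded_sfs <- tabulate(minor_count, nbins = n_ind)
folded_sfs
```

Bin i of the result holds the number of loci whose minor allele was observed i times, so the vector has as many bins as there are individuals.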

To run Stairway Plot 2, the binaries need to be provided in a single folder; they can be downloaded via the gl.download.binary function. Your system also needs to have Java installed. For more details on the method and how to install it on your system, refer to the GitHub repository: https://github.com/xiaoming-liu/stairway-plot-v2. Please also refer to the original publication: doi:10.1186/s13059-020-02196-9. If you use this method, make sure you cite the original publication in your work.

This function implements the theoretical and computational procedures described by Liu and Fu (2020), making it suitable for a wide range of population-genomic datasets to uncover historical demographic patterns.

Please note: there is currently no good way to estimate L, the total length of all sequences. Users of DArT data often use the number of loci multiplied by 69, but this is an underestimate because monomorphic loci need to be included (and the length of the restriction site should be added for each locus). For the mutation rate mu, the default value is set to 5e-9, but it should be adapted to the species of interest. The good news is that the settings of L and mu affect only the scaling of the axes of the inferred history, not its shape. Users can therefore infer the shape, but need to be careful with a temporal interpretation, as both the x and y axes are affected by the mutation rate and L.
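The note on L and mu can be illustrated with a few lines of base R. The sketch computes the rough lower bound for L (number of loci times 69) and then shows, under the standard diploid relation theta = 4 * Ne * mu * L, that changing mu rescales the estimated Ne without changing anything else. The locus count, the theta value and the helper ne_from_theta are hypothetical and chosen only for illustration; this is not the computation performed inside Stairway Plot 2.

```r
# Rough lower bound for L: number of loci times the 69 bp tag length.
n_loci <- 1000            # hypothetical number of loci
tag_length <- 69
L_lower <- n_loci * tag_length

# Hypothetical helper: Ne implied by a total theta, assuming
# theta = 4 * Ne * mu * L for a diploid population.
ne_from_theta <- function(theta, mu, L) theta / (4 * mu * L)

theta_total <- 50         # made-up value, for illustration only
ne_a <- ne_from_theta(theta_total, mu = 5e-9, L = L_lower)
ne_b <- ne_from_theta(theta_total, mu = 1e-8, L = L_lower)

# Doubling mu halves the implied Ne: only the axis scaling changes.
ne_a / ne_b
```

The same reasoning applies to L: a larger L lowers the implied Ne by the same factor at every point of the curve, which is why the shape is unaffected.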

Usage

gl.run.stairway2(
  x,
  L = NULL,
  mu = NULL,
  stairway2.path,
  minbinsize = 1,
  maxbinsize = NULL,
  gentime = 1,
  sfs = NULL,
  parallel = 1,
  run = TRUE,
  blueprint = "blueprint",
  filename = "sample",
  pct_training = 0.67,
  nrand = NULL,
  stairway_plot_dir = "stairway_plot_es",
  nreps = 200,
  seed = NULL,
  plot_title = "Ne",
  xmin = 0,
  xmax = 0,
  ymin = 0,
  ymax = 0,
  xspacing = 2,
  yspacing = 2,
  fontsize = 12,
  cleanup = TRUE,
  plot.display = TRUE,
  plot.theme = theme_dartR(),
  plot.dir = NULL,
  plot.file = NULL,
  verbose = NULL
)

Arguments

x

A genlight/dartR object containing SNP data.

L

the length of the sequence in base pairs. (see the note in the Description)

mu

the mutation rate per base pair per generation. (see the note in the Description)

stairway2.path

the path to the Stairway Plot 2 executable. (check the example)

minbinsize

the minimum bin size for the SFS that should be used. (default=1)

maxbinsize

the maximum bin size for the SFS that should be used. (default=NULL, so the maximum bin size is set to the number of samples in the dataset)

gentime

the generation time in years. (default=1)

sfs

the folded site frequency spectrum (SFS) to be used for the analysis. If not provided the SFS is created from the genlight/dartR object (default=NULL)

parallel

the number of parallel processes to use for the analysis. (default=1)

run

logical. If TRUE, the analysis is run immediately. Otherwise only the blueprint files are created [might be useful to run on a cluster]. (default=TRUE)

blueprint

the name of the blueprint file. (default="blueprint")

filename

the name of the output file. Also used for the plot. (default="sample")

pct_training

the proportion of the data to use for training. (default=0.67)

nrand

the number of breakpoints to use for the analysis. (default=NULL)

stairway_plot_dir

the name of the directory where the stairway plot is saved. (default="stairway_plot_es")

nreps

the number of bootstrap replicates to use for the analysis. (default=200)

seed

the random seed to use for the analysis. (default=NULL)

plot_title

the title of the plot. (default="Ne"+filename)

xmin

minimum x value for the plot. (default=0)

xmax

maximum x value for the plot. (default=0)

ymin

minimum y value for the plot. (default=0)

ymax

maximum y value for the plot. (default=0)

xspacing

spacing between x values for the plot. (default=2)

yspacing

spacing between y values for the plot. (default=2)

fontsize

the font size for the plot. (default=12)

cleanup

logical. If TRUE, the Stairway Plot 2 output files are removed. (default=TRUE)

plot.display

Specify if plot is to be produced [default TRUE].

plot.theme

User specified theme [default theme_dartR()].

plot.dir

Directory to save the plot RDS files [default as specified by the global working directory or tempdir()]

plot.file

Filename (minus extension) for the RDS plot file [Required for plot save]

verbose

Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, unless specified using gl.set.verbosity].
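For the nrand argument, the upstream Stairway Plot 2 documentation suggests, to the best of our knowledge, four breakpoint counts of roughly (nseq-2)/4, (nseq-2)/2, (nseq-2)*3/4 and nseq-2, where nseq is the number of sequences. The sketch below computes these values in base R for a hypothetical sample. Whether gl.run.stairway2 applies exactly this rule when nrand = NULL is an assumption that should be checked against the function source.

```r
# Hypothetical sample: 15 diploid individuals, i.e. 30 sequences.
n_ind <- 15
nseq <- 2 * n_ind

# Suggested breakpoint counts, rounded to whole numbers.
# Assumption: rule taken from the upstream Stairway Plot 2 manual.
nrand_suggested <- round(c((nseq - 2) / 4,
                           (nseq - 2) / 2,
                           (nseq - 2) * 3 / 4,
                           nseq - 2))
nrand_suggested
```

A vector like this could be passed as nrand to override the default.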

Value

Returns a list with two components. One of them, history, is accessed in the example below as sw$history.

References

Liu, X., & Fu, Y. X. (2020). Stairway Plot 2: demographic history inference with folded SNP frequency spectra. Genome Biology, 21(1), 280. doi:10.1186/s13059-020-02196-9

Examples

## Not run: 
# download the binary to tempdir() if it is not already installed
gl.download.binary(software = "stairway2", os = "windows")
library(dartR.data)  # provides the possums.gl example dataset
# run on a small subset (50 individuals, 100 loci) with few replicates
sw <- gl.run.stairway2(possums.gl[1:50, 1:100], L = 1e5, mu = 1e-9,
           stairway2.path = file.path(tempdir(), "stairway2"),
           parallel = 5, nreps = 10)
head(sw$history)

## End(Not run)

[Package dartR.popgen version 1.0.0 Index]