gl.run.stairway2 {dartR.popgen} | R Documentation |
Run Stairway Plot 2 for Demographic History Inference
Description
This function runs Stairway Plot 2, a method for inferring demographic history from folded SNP frequency spectra. The key features and methodology of Stairway Plot 2 include:
- Folded SNP Frequency Spectra: The method uses folded SNP frequency spectra, which are less sensitive to errors in ancestral state inference than unfolded spectra.
- Demographic Inference: By analyzing the SNP frequency spectra, Stairway Plot 2 can infer changes in population size over time, providing insights into historical demographic events.
- Bootstrap Replicates: The method employs bootstrap replicates to estimate confidence intervals for the inferred demographic history.
- Flexible Modeling: Stairway Plot 2 allows for flexible modeling of demographic history without assuming a specific parametric form for population size changes.
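To make the term "folded" concrete, here is a minimal base-R sketch that folds an unfolded spectrum by pooling each derived-allele count class i with its complement n - i. It is an illustration only, not the code used by dartR or Stairway Plot 2; the helper name fold_sfs and the example counts are hypothetical.

```r
# Fold an unfolded SFS (illustrative helper, not part of dartR).
# 'unfolded' holds the number of SNPs with derived allele count 1..(n-1),
# where n is the number of sampled chromosomes.
fold_sfs <- function(unfolded) {
  n <- length(unfolded) + 1        # number of sampled chromosomes
  half <- floor(n / 2)             # number of minor-allele count classes
  folded <- numeric(half)
  for (i in seq_len(half)) {
    j <- n - i                     # complementary count class
    # when i == j (n even, middle class) the class is counted only once
    folded[i] <- if (i == j) unfolded[i] else unfolded[i] + unfolded[j]
  }
  folded
}

# Hypothetical counts for n = 6 chromosomes (classes 1..5)
unfolded <- c(10, 6, 4, 3, 2)
folded <- fold_sfs(unfolded)
```

Because classes i and n - i are pooled, the folded spectrum does not depend on knowing which allele is ancestral, which is why errors in ancestral state inference matter less.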
To run Stairway Plot 2, the binaries need to be provided in a single folder; they can be downloaded via the gl.download.binary function. Your system also needs to have Java installed. For more details on the method and how to install it on your system, refer to the GitHub repository: https://github.com/xiaoming-liu/stairway-plot-v2. Please also refer to the original publication: doi:10.1186/s13059-020-02196-9. **If you use this method, make sure you cite the original publication in your work.**
This function implements the theoretical and computational procedures described by Liu and Fu (2020), making it suitable for a wide range of population-genomic datasets to uncover historical demographic patterns.
Please note: There is currently no reliable way to estimate L, the total length
of all sequences. Users of DArT data often use the number of loci multiplied
by 69, but this is an underestimate because monomorphic loci need to be
included (the length of the restriction site should also be added for each locus).
For the mutation rate mu, the default value is set to 5e-9, but it should be adapted
to the species of interest. The good news is that the settings of L and mu affect
only the axes of the inferred history, not its shape.
Users can therefore infer the shape, but need to be careful with a temporal interpretation,
as both the x and y axes are affected by the mutation rate and L.
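The rule of thumb for L can be written out as simple arithmetic. The sketch below uses made-up numbers (both locus counts are hypothetical) purely to show why counting only polymorphic loci gives a lower bound.

```r
# Rough estimate of L; all counts are hypothetical and only illustrate the arithmetic.
n_snp_loci <- 5000       # polymorphic loci kept in the dataset
tag_length <- 69         # rule-of-thumb tag length in base pairs

# Lower bound: polymorphic loci only
L_lower <- n_snp_loci * tag_length

# Including monomorphic (invariant) tags raises the estimate;
# their number would have to come from the sequencing report.
n_monomorphic <- 20000   # hypothetical
L_with_mono <- (n_snp_loci + n_monomorphic) * tag_length
```

With a genlight object x, the first quantity would typically be nLoc(x) * 69. The length of the restriction site per locus is left out here and would increase both values further.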
Usage
gl.run.stairway2(
x,
L = NULL,
mu = NULL,
stairway2.path,
minbinsize = 1,
maxbinsize = NULL,
gentime = 1,
sfs = NULL,
parallel = 1,
run = TRUE,
blueprint = "blueprint",
filename = "sample",
pct_training = 0.67,
nrand = NULL,
stairway_plot_dir = "stairway_plot_es",
nreps = 200,
seed = NULL,
plot_title = "Ne",
xmin = 0,
xmax = 0,
ymin = 0,
ymax = 0,
xspacing = 2,
yspacing = 2,
fontsize = 12,
cleanup = TRUE,
plot.display = TRUE,
plot.theme = theme_dartR(),
plot.dir = NULL,
plot.file = NULL,
verbose = NULL
)
Arguments
x |
A genlight/dartR object containing SNP data. |
L |
the length of the sequence in base pairs. (see the note in the Description) |
mu |
the mutation rate per base pair per generation. (see the note in the Description) |
stairway2.path |
the path to the Stairway Plot 2 executable. (check the example) |
minbinsize |
the minimum bin size for the SFS that should be used. (default=1) |
maxbinsize |
the maximum bin size for the SFS that should be used. (default=NULL, so the maximum bin size is set to the number of samples in the dataset) |
gentime |
the generation time in years. (default=1) |
sfs |
the folded site frequency spectrum (SFS) to be used for the analysis. If not provided the SFS is created from the genlight/dartR object (default=NULL) |
parallel |
the number of parallel processes to use for the analysis. (default=1) |
run |
logical. If TRUE, the analysis is run immediately. Otherwise only the blueprint files are created (this can be useful for running the analysis on a cluster). (default=TRUE) |
blueprint |
the name of the blueprint file. (default="blueprint") |
filename |
the file name. Also used for the plot. (default="sample") |
pct_training |
the percentage of the data to use for training. (default=0.67) |
nrand |
the number of breakpoints to use for the analysis. (default=NULL) |
stairway_plot_dir |
the name of the directory where the stairway plot is saved. (default="stairway_plot_es") |
nreps |
the number of bootstrap replicates to use for the analysis. (default=200) |
seed |
the random seed to use for the analysis. (default=NULL) |
plot_title |
the title of the plot. (default="Ne"+filename) |
xmin |
minimum x value for the plot. (default=0) |
xmax |
maximum x value for the plot. (default=0) |
ymin |
minimum y value for the plot. (default=0) |
ymax |
maximum y value for the plot. (default=0) |
xspacing |
spacing between x values for the plot. (default=2) |
yspacing |
spacing between y values for the plot. (default=2) |
fontsize |
the font size for the plot. (default=12) |
cleanup |
logical. If TRUE, the Stairway Plot 2 output files are removed. (default=TRUE) |
plot.display |
Specify if plot is to be produced [default TRUE]. |
plot.theme |
User specified theme [default theme_dartR()]. |
plot.dir |
Directory to save the plot RDS files [default as specified by the global working directory or tempdir()] |
plot.file |
Filename (minus extension) for the RDS plot file [Required for plot save] |
verbose |
Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, unless specified using gl.set.verbosity]. |
Value
Returns a list with two components:
history: Ne estimates over generations (generation, median, low and high)
plot: a ggplot of the history
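As a hedged sketch of how the history component might be inspected, the code below builds a mock object with the four columns listed above and summarises it in base R. The exact column names and all values are assumptions for illustration; check names(sw$history) on a real result.

```r
# Mock of the documented return value; column names and values are assumptions.
sw_mock <- list(
  history = data.frame(
    generation = c(10, 100, 1000, 10000),
    median     = c(500, 800, 2000, 1500),
    low        = c(300, 500, 1200, 900),
    high       = c(900, 1300, 3500, 2600)
  )
)

h <- sw_mock$history

# Time point with the smallest median Ne estimate
smallest <- h[which.min(h$median), ]

# Sanity check: the median should lie within the reported interval
interval_ok <- all(h$low <= h$median & h$median <= h$high)

# Relative width of the interval, a simple measure of uncertainty
h$rel_width <- (h$high - h$low) / h$median
```

Given the note on L and mu in the Description, comparisons of shape (for example where the smallest estimate falls relative to the others) are safer to interpret than the absolute values.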
References
Liu, X., & Fu, Y. X. (2020). Stairway Plot 2: demographic history inference with folded SNP frequency spectra. Genome Biology, 21(1), 280. doi:10.1186/s13059-020-02196-9
Examples
## Not run:
# download the binary to tempdir() if it is not already installed
# (set os to match your system; "windows" is used here)
gl.download.binary(software = "stairway2", os = "windows")
require(dartR.data)
sw <- gl.run.stairway2(possums.gl[1:50, 1:100], L = 1e5, mu = 1e-9,
  stairway2.path = file.path(tempdir(), "stairway2"),
  parallel = 5, nreps = 10)
head(sw$history)
## End(Not run)