gl.run.faststructure {dartR.popgen} | R Documentation |
Runs a fastStructure analysis using a genlight object
Description
This function takes a genlight object and runs a fastStructure analysis on it.
Usage
gl.run.faststructure(
x,
k.range,
num.k.rep = 1,
exec = "./fastStructure",
exec.plink = getwd(),
output = getwd(),
tol = 1e-05,
prior = "simple",
cv = 0,
seed = NULL
)
Arguments
x |
Name of the genlight object containing the SNP data [required]. |
k.range |
Range of the number of populations [required]. |
num.k.rep |
Number of replicates [default 1]. |
exec |
Full path to the fastStructure executable, including file name and extension [default "./fastStructure", i.e. in the working directory]. |
exec.plink |
Path to the plink executable [default working directory]. |
output |
Path to output file [default getwd()]. |
tol |
Convergence criterion [default 1e-05]. |
prior |
Choice of prior: simple or logistic [default "simple"]. |
cv |
Number of test sets for cross-validation, 0 implies no CV step [default 0]. |
seed |
Seed for random number generator [default NULL]. |
Details
Download the fastStructure binary for your system from the address below (fastStructure runs only on macOS or Linux):
https://github.com/StuntsPT/Structure_threader/tree/master/structure_threader/bins
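Move the fastStructure file to the working directory and make it executable, for example by running the following command from R. The file name in the command must match the downloaded file exactly, because file names on Linux are case-sensitive.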
system(paste0("chmod u+x ",getwd(), "/faststructure"))
Download the plink binary for your system from the address below:
https://www.cog-genomics.org/plink/
Move the plink file to the working directory and make it executable, for example by running the following command from R:
system(paste0("chmod u+x ",getwd(), "/plink"))
To install fastStructure dependencies follow these directions: https://github.com/rajanil/fastStructure
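The setup steps above can be checked from R before starting an analysis. The following is a minimal sketch, not part of dartR.popgen: the helpers is_executable() and ensure_executable() are hypothetical, and the file names "fastStructure" and "plink" in the working directory are assumptions that must be adjusted to match the downloaded files.

```r
# Hypothetical helper: TRUE if `path` exists and the current user may execute it.
is_executable <- function(path) {
  file.exists(path) && file.access(path, mode = 1) == 0
}

# Hypothetical helper: adds the owner-execute bit (octal 0100), like `chmod u+x`,
# and reports what was done.
ensure_executable <- function(path) {
  if (!file.exists(path)) return("not found")
  if (is_executable(path)) return("ready")
  current <- as.integer(file.mode(path))
  Sys.chmod(path, mode = as.octmode(bitwOr(current, 64L)), use_umask = FALSE)
  "made executable"
}

# Assumed file names; change them to match the binaries you downloaded.
fs_path <- file.path(getwd(), "fastStructure")
plink_path <- file.path(getwd(), "plink")

status <- vapply(c(fs_path, plink_path), ensure_executable, character(1))
print(status)
```

A status of "not found" usually means the binary is in a different directory or is spelled differently, for example with different capitalisation.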
fastStructure performs inference for the simplest, independent-loci, admixture model, with two choices of priors that can be specified using the --prior parameter. Thus, unlike Structure, fastStructure does not require the mainparams and extraparams files. The inference algorithm used by fastStructure is fundamentally different from that of Structure and requires far fewer options to be set.
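To illustrate how few options are involved, the sketch below assembles a fastStructure command line from the arguments documented above. It is an illustration only and not necessarily how gl.run.faststructure builds its call: the helper fs_command() and the input and output names are hypothetical, while the flags (-K, --input, --output, --tol, --prior, --cv, --seed) are those listed in the fastStructure usage text.

```r
# Hypothetical helper: builds a fastStructure command string for one value of K.
fs_command <- function(exec, k, input, output, tol = 1e-05,
                       prior = c("simple", "logistic"), cv = 0, seed = NULL) {
  prior <- match.arg(prior)  # stops with an error for any other prior
  args <- c(
    "-K", k,
    paste0("--input=", input),
    paste0("--output=", output),
    paste0("--tol=", format(tol, scientific = TRUE)),
    paste0("--prior=", prior),
    paste0("--cv=", cv)
  )
  # The seed flag is added only when a seed is supplied
  if (!is.null(seed)) args <- c(args, paste0("--seed=", seed))
  paste(c(exec, args), collapse = " ")
}

# Assumed input and output names, for illustration only
cmd <- fs_command("./fastStructure", k = 2, input = "genotypes", output = "res")
print(cmd)
```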
To identify the number of populations that best approximates the marginal likelihood of the data, the marginal likelihood is extracted from each run, averaged across replicates for each value of K, and plotted against K.
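The averaging step can be sketched in base R as follows. The marginal likelihood values are made up for illustration, and the data frame layout (columns k, rep and marginal_likelihood) is an assumption, not the structure returned by the function. The sketch takes the K with the highest mean marginal likelihood as the candidate.

```r
# Made-up marginal likelihoods for K = 2..4 with two replicates each
runs <- data.frame(
  k = rep(2:4, each = 2),
  rep = rep(1:2, times = 3),
  marginal_likelihood = c(-0.92, -0.90, -0.85, -0.87, -0.88, -0.89)
)

# Mean marginal likelihood per value of K
mean_ml <- aggregate(marginal_likelihood ~ k, data = runs, FUN = mean)

# K with the highest mean marginal likelihood
best_k <- mean_ml$k[which.max(mean_ml$marginal_likelihood)]

# Plot the mean marginal likelihood against K
plot(mean_ml$k, mean_ml$marginal_likelihood,
  type = "b", xlab = "K", ylab = "Mean marginal likelihood"
)
```

With these illustrative values the mean peaks at K = 3. With real data, the curve is often flat near its maximum, so the plot is worth inspecting rather than relying on the maximum alone.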
Value
A list in which each entry is the output of a single fastStructure run (there are length(k.range) * num.k.rep runs in total).
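As a quick check of how many runs to expect, the combinations of K and replicate can be enumerated in base R. This is a sketch using the values from the example below (k.range = 2:3, num.k.rep = 2); the object name run_grid is hypothetical.

```r
k.range <- 2:3
num.k.rep <- 2

# One row per run: every value of K combined with every replicate
run_grid <- expand.grid(k = k.range, rep = seq_len(num.k.rep))

n_runs <- nrow(run_grid)
print(run_grid)
print(n_runs)
```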
Author(s)
Luis Mijangos (Post to https://groups.google.com/d/forum/dartr)
References
Raj, A., Stephens, M., & Pritchard, J. K. (2014). fastSTRUCTURE: variational inference of population structure in large SNP data sets. Genetics, 197(2), 573-589.
Examples
## Not run:
# Please note: fastStructure needs to be installed
# Please note: fastStructure is not available for Windows
t1 <- gl.filter.callrate(platypus.gl, threshold = 1)
res <- gl.run.faststructure(t1,
  exec = "./fastStructure", k.range = 2:3,
  num.k.rep = 2, output = paste0(getwd(), "/res_str")
)
qmat <- gl.plot.faststructure(res, k.range = 2:3)
gl.map.structure(qmat, K = 2, t1, scalex = 1, scaley = 0.5)
## End(Not run)