gl.run.structure {dartR.popgen} | R Documentation |
Runs a STRUCTURE analysis using a genlight object
Description
This function takes a genlight object and runs a STRUCTURE analysis based on
functions from strataG
Usage
gl.run.structure(
x,
exec = "./structure",
k.range = NULL,
num.k.rep = 1,
burnin = 1000,
numreps = 1000,
noadmix = TRUE,
freqscorr = FALSE,
randomize = TRUE,
seed = 0,
pop.prior = NULL,
locpriorinit = 1,
maxlocprior = 20,
gensback = 2,
migrprior = 0.05,
pfrompopflagonly = TRUE,
popflag = NULL,
inferalpha = FALSE,
alpha = 1,
unifprioralpha = TRUE,
alphamax = 20,
alphapriora = 0.05,
alphapriorb = 0.001,
plot.out = TRUE,
plot_theme = theme_dartR(),
plot.dir = tempdir(),
plot.file = NULL,
verbose = NULL
)
Arguments
x |
Name of the genlight object containing the SNP data [required]. |
exec |
Full path and file name (including extension) of the STRUCTURE
executable [default "./structure"]. |
k.range |
Range of the number of populations [required]. |
num.k.rep |
Number of replicates [default 1]. |
burnin |
Number of iterations for MCMC burnin [default 1000]. |
numreps |
Number of MCMC replicates [default 1000]. |
noadmix |
Logical. No admixture? [default TRUE]. |
freqscorr |
Logical. Correlated frequencies? [default FALSE]. |
randomize |
Randomize [default TRUE]. |
seed |
Set random seed [default 0]. |
pop.prior |
A character specifying which population prior model to use: "locprior" or "usepopinfo" [default NULL]. |
locpriorinit |
Parameterizes locprior parameter r - how informative the populations are. Only used when pop.prior = "locprior" [default 1]. |
maxlocprior |
Specifies range of locprior parameter r. Only used when pop.prior = "locprior" [default 20]. |
gensback |
Integer defining the number of generations back to test for immigrant ancestry. Only used when pop.prior = "usepopinfo" [default 2]. |
migrprior |
Numeric between 0 and 1 listing migration prior. Only used when pop.prior = "usepopinfo" [default 0.05]. |
pfrompopflagonly |
Logical. Update allele frequencies only from individuals specified by popflag. Only used when pop.prior = "usepopinfo" [default TRUE]. |
popflag |
A vector of integers (0, 1) or logicals identifying whether or not to use strata information. Only used when pop.prior = "usepopinfo" [default NULL]. |
inferalpha |
Logical. Infer the value of the model parameter alpha from the data; otherwise alpha is fixed at the value of the alpha argument chosen by the user. This option is ignored under the NOADMIX model. Small alpha implies that most individuals are essentially from one population or another, while alpha > 1 implies that most individuals are admixed [default FALSE]. |
alpha |
Dirichlet parameter for degree of admixture. This is the initial value if inferalpha = TRUE [default 1]. |
unifprioralpha |
Logical. Assume a uniform prior for alpha which runs between 0 and alphamax. This model seems to work fine; the alternative model (when unifprioralpha = FALSE) is to take alpha as having a Gamma prior, with mean alphapriora × alphapriorb, and variance alphapriora × alphapriorb^2 [default TRUE]. |
alphamax |
Maximum for uniform prior on alpha when unifprioralpha = TRUE [default 20]. |
alphapriora |
Parameters of Gamma prior on alpha when unifprioralpha = FALSE [default 0.05]. |
alphapriorb |
Parameters of Gamma prior on alpha when unifprioralpha = FALSE [default 0.001]. |
plot.out |
Create an Evanno plot once finished. Note that k.range must contain at least three different values of k [default TRUE]. |
plot_theme |
Theme for the plot. See details for options [default theme_dartR()]. |
plot.dir |
Directory to save the plot RDS files [default as specified by the global working directory or tempdir()] |
plot.file |
Name for the RDS binary file to save (base name only, exclude extension) [default NULL] |
verbose |
Set verbosity for this function (though structure output cannot be switched off currently) [default NULL]. |
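As a small illustration of the Gamma prior described under unifprioralpha, the sketch below computes the prior mean and variance implied by the default alphapriora and alphapriorb. It assumes that alphapriora acts as the shape and alphapriorb as the scale of the Gamma distribution, which is the parameterisation that matches the stated mean and variance. gamma_prior_moments is a hypothetical helper and is not part of dartR.popgen.

```r
# Hypothetical helper (not part of dartR.popgen): moments of the Gamma prior
# on alpha, ASSUMING shape = alphapriora and scale = alphapriorb.
gamma_prior_moments <- function(alphapriora = 0.05, alphapriorb = 0.001) {
  c(mean = alphapriora * alphapriorb,
    variance = alphapriora * alphapriorb^2)
}

# Moments implied by the documented defaults (0.05 and 0.001).
moments <- gamma_prior_moments()
print(moments)

# Cross-check the shape/scale reading with simulated draws, using larger
# illustrative values (shape 2, scale 3) so the numbers are easy to read.
set.seed(1)
draws <- rgamma(1e5, shape = 2, scale = 3)
sim <- c(mean = mean(draws), variance = var(draws))
print(sim)  # expected to be close to 2 * 3 = 6 and 2 * 3^2 = 18
```

With the defaults, the prior is concentrated on very small values of alpha, which may or may not suit a given data set; the STRUCTURE manual is the authority on how to choose these values.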
Details
This function is a convenience wrapper around the strataG function
structureRun
(Archer et al. 2016). For a detailed
description, refer to the strataG package documentation (see References below).
Before running STRUCTURE, we suggest reading its manual (linked below) and the literature listed in the References section.
https://web.stanford.edu/group/pritchardlab/structure_software/release_versions/v2.3.4/structure_doc.pdf
To use this function, you need to download the non-GUI version of STRUCTURE for your system from the STRUCTURE website.
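Because the function relies on an external executable, it can help to confirm that the path intended for exec is valid before starting a long run. The following is a minimal sketch using base R only; check_structure_exec is a hypothetical helper, not part of dartR.popgen, and the permission check is only applied on Unix-like systems.

```r
# Hypothetical pre-flight check (not part of dartR.popgen): confirm that the
# path intended for `exec` points to an existing, executable file.
check_structure_exec <- function(exec) {
  if (!file.exists(exec)) {
    stop("STRUCTURE executable not found at: ", exec)
  }
  # file.access() returns 0 on success; mode = 1 tests execute permission.
  if (.Platform$OS.type == "unix" && file.access(exec, mode = 1) != 0) {
    stop("File exists but is not executable: ", exec)
  }
  # Return the absolute path invisibly so it can be passed on to `exec`.
  invisible(normalizePath(exec))
}

# Demonstration with a temporary stand-in file (not a real STRUCTURE binary).
fake_exec <- tempfile("structure_")
file.create(fake_exec)
Sys.chmod(fake_exec, mode = "0755")
checked_path <- check_structure_exec(fake_exec)

# A path that does not exist raises an error.
missing_result <- try(
  check_structure_exec(file.path(tempdir(), "no_such_structure_binary")),
  silent = TRUE
)
```

The returned absolute path could then be supplied as the exec argument, which avoids surprises when the working directory changes.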
Format note
For this function to work, make sure that individual and population names
contain no spaces. To replace spaces with underscores, you can use the R
function gsub
as shown below.
popNames(gl) <- gsub(" ", "_", popNames(gl))
indNames(gl) <- gsub(" ", "_", indNames(gl))
It is also worth noting that STRUCTURE truncates individual names at 11
characters. The function will fail if the names of individuals are not unique
after truncation. To avoid this problem, a number sequence, as
shown in the code below, can be used instead of individual names.
indNames(gl) <- as.character(seq_along(indNames(gl)))
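The two naming rules above can be checked in advance on a plain character vector. The sketch below is one possible way to do this with base R; check_structure_names and the example names are hypothetical and only serve as illustration.

```r
# Hypothetical helper (not part of dartR.popgen): replace spaces with
# underscores and test whether names remain unique after truncation.
check_structure_names <- function(ind_names, max_chars = 11) {
  cleaned <- gsub(" ", "_", ind_names)
  # Mimic the truncation at 11 characters described above.
  truncated <- substr(cleaned, 1, max_chars)
  list(
    cleaned = cleaned,
    truncated = truncated,
    unique_after_truncation = anyDuplicated(truncated) == 0
  )
}

# Made-up individual names, for illustration only.
res <- check_structure_names(c("Site A sample 01", "Site A sample 02", "B 7"))
print(res$truncated)
print(res$unique_after_truncation)
```

Here the first two names collapse to the same 11-character string, so unique_after_truncation is FALSE; in that case the number sequence approach shown above is the simpler fix.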
Value
An sr object (structure.result list output). Each list entry is the output
of a single STRUCTURE run (there are length(k.range) * num.k.rep runs in total).
For example, the summary output of the first run can be accessed via
sr[[1]]$summary
and the q-matrix of the third run via
sr[[3]]$q.mat
To conveniently summarise the outputs across runs
(clumpp), run gl.plot.structure on the returned sr object. For
Evanno plots, run gl.evanno on the sr object.
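To make the number of list entries concrete, the sketch below counts the runs for the settings used in the Examples section (k.range = 2:5, num.k.rep = 3). It uses base R only and does not assume any particular ordering of the entries within the sr object.

```r
# Settings taken from the Examples section.
k.range <- 2:5
num.k.rep <- 3

# One run per combination of k and replicate.
n_runs <- length(k.range) * num.k.rep
print(n_runs)

# Enumerate the combinations; the order of entries in sr is NOT implied here.
runs <- expand.grid(rep = seq_len(num.k.rep), k = k.range)
print(nrow(runs))
```

Both counts are 12 for these settings, so the returned sr object would be expected to hold 12 entries.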
Author(s)
Bernd Gruber (Post to https://groups.google.com/d/forum/dartr)
References
Pritchard, J.K., Stephens, M., Donnelly, P. (2000) Inference of population structure using multilocus genotype data. Genetics 155, 945-959.
Archer, F. I., Adams, P. E. and Schneiders, B. B. (2016) strataG: An R package for manipulating, summarizing and analysing population genetic data. Mol Ecol Resour. doi:10.1111/1755-0998.12559
Wang, J. (2017) The computer program structure for assigning individuals to populations: easy to use but easier to misuse. Molecular Ecology Resources 17, 981-990.
Lawson, D.J., van Dorp, L., Falush, D. (2018) A tutorial on how not to over-interpret STRUCTURE and ADMIXTURE bar plots. Nature Communications 9, 3258.
Porras-Hurtado, L., et al. (2013) An overview of STRUCTURE: applications, parameter settings, and supporting software. Frontiers in Genetics 4, 98.
Examples
# examples need structure to be installed on the system (see above)
## Not run:
bc <- bandicoot.gl[,1:100]
sr <- gl.run.structure(bc, k.range = 2:5, num.k.rep = 3,
exec = './structure.exe')
ev <- gl.evanno(sr)
ev
qmat <- gl.plot.structure(sr, K=3)
head(qmat)
gl.map.structure(qmat, bc, scalex=1, scaley=0.5)
## End(Not run)