gl.spatial.autoCorr {dartR} | R Documentation |
Spatial autocorrelation following Smouse and Peakall 1999
Description
Global spatial autocorrelation is a multivariate approach combining all loci into a single analysis. The autocorrelation coefficient "r" is calculated for each pair of individuals in each specified distance class. For more information see Smouse and Peakall 1999, Peakall et al. 2003 and Smouse et al. 2008.
Usage
gl.spatial.autoCorr(
x = NULL,
Dgeo = NULL,
Dgen = NULL,
coordinates = "latlon",
Dgen_method = "Euclidean",
Dgeo_trans = "Dgeo",
Dgen_trans = "Dgen",
bins = 5,
reps = 100,
plot.pops.together = FALSE,
permutation = TRUE,
bootstrap = TRUE,
plot_theme = NULL,
plot_colors_pop = NULL,
CI_color = "red",
plot.out = TRUE,
save2tmp = FALSE,
verbose = NULL
)
Arguments
x |
Genlight object [default NULL]. |
Dgeo |
Geographic distance matrix if no genlight object is provided. This is typically a Euclidean distance, but it can be any meaningful (geographical) distance metric [default NULL]. |
Dgen |
Genetic distance matrix if no genlight object is provided [default NULL]. |
coordinates |
Can be either 'latlon', 'xy' or a two-column data.frame
(with column names 'lat', 'lon', 'x', 'y'). Coordinates are provided via
|
Dgen_method |
Method to calculate genetic distances. See details [default "Euclidean"]. |
Dgeo_trans |
Transformation to be used on the geographic distances. See Dgen_trans [default "Dgeo"]. |
Dgen_trans |
You can provide a formula to transform the genetic
distance. The transformation can be applied as a formula using Dgen as the
variable to be transformed. For example: |
bins |
The number of bins for the distance classes
(i.e. |
reps |
The number to be used for permutation and bootstrap analyses [default 100]. |
plot.pops.together |
Plot all the populations in one plot. Confidence intervals from permutations are not shown [default FALSE]. |
permutation |
Whether permutation calculations for the null hypothesis of no spatial structure should be carried out [default TRUE]. |
bootstrap |
Whether bootstrap calculations to compute the 95% confidence intervals around r should be carried out [default TRUE]. |
plot_theme |
Theme for the plot. See details [default NULL]. |
plot_colors_pop |
A color palette for populations or a list with as many colors as there are populations in the dataset [default NULL]. |
CI_color |
Color for the shade of the 95% confidence intervals around the r estimates [default "red"]. |
plot.out |
Specify if plot is to be produced [default TRUE]. |
save2tmp |
If TRUE, saves any ggplots and listings to the session temporary directory (tempdir) [default FALSE]. |
verbose |
Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default NULL, unless specified using gl.set.verbosity]. |
Details
This function executes a modified version of spautocorr from the package
PopGenReport. Unlike PopGenReport, this function also computes the 95%
confidence intervals around r via bootstraps, the 95% confidence intervals
around the null hypothesis of no spatial structure and the one-tail test via
permutation, and the correction factor described by Peakall et al. 2003.
The input can be i) a genlight object (which has to have the latlon slot
populated), ii) a pair of Dgeo and Dgen, which have to be either matrix or
dist objects, or iii) a list of matrix or dist objects if the analysis needs
to be carried out for multiple populations. In this last case, all the
elements of the list have to be of the same class (i.e. matrix or dist) and
the population order in the two lists has to be the same.
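As a rough illustration of input option iii, the sketch below builds two matching lists of dist objects with base R only. The coordinates and genotypes are simulated, the population names popA and popB and the helper make_pop are hypothetical, and the final call is left commented out because it assumes dartR is installed.

```r
set.seed(1)

# Hypothetical data: 6 individuals per population, planar coordinates in
# metres and 20 SNP loci coded 0/1/2 (all values simulated for illustration)
make_pop <- function(n = 6, n_loci = 20) {
  list(
    xy  = cbind(x = runif(n, 0, 5000), y = runif(n, 0, 5000)),
    snp = matrix(sample(0:2, n * n_loci, replace = TRUE), nrow = n)
  )
}
pops <- list(popA = make_pop(), popB = make_pop())

# One element per population; both lists have the same class ("dist")
# and the same population order
Dgeo_list <- lapply(pops, function(p) dist(p$xy))
Dgen_list <- lapply(pops, function(p) dist(p$snp))

stopifnot(identical(names(Dgeo_list), names(Dgen_list)))

# Assumed call, following the Usage section above:
# res <- gl.spatial.autoCorr(Dgeo = Dgeo_list, Dgen = Dgen_list, bins = 5)
```

Building both lists from the same pops object is one simple way to guarantee that the population order matches.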
If the input is a genlight object, the function calculates the linear
distance for Dgeo and the relevant Dgen matrix (see Dgen_method) for each
population.
When the method selected is a genetic similarity matrix (e.g. "simple"
distance), the matrix is internally transformed with 1 - Dgen so that
positive values of the autocorrelation coefficient indicate more closely
related individuals, as implemented in GenAlEx. If the user provides the
distance matrices, care must be taken in interpreting the results because a
similarity matrix will generate negative values for closely related
individuals.
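When supplying your own matrices, one way to avoid the sign reversal is to convert a similarity matrix to a distance before passing it in. This is a minimal sketch with a made-up 3 x 3 similarity matrix; it assumes similarities are scaled between 0 and 1.

```r
# Hypothetical similarity matrix for three individuals (1 = identical)
S <- matrix(c(1.0, 0.8, 0.2,
              0.8, 1.0, 0.3,
              0.2, 0.3, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Same transformation described above: similar pairs get small distances
Dgen <- 1 - S
```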
If max(Dgeo) > 1000 (e.g. the geographic distances are in thousands of
metres), values are divided by 1000 (in this example they would then be in
km) to make the plots easier to read.
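The rescaling rule can be sketched as follows. The helper rescale_dgeo is hypothetical and only mirrors the description above; it is not the function's internal code.

```r
# Hypothetical helper: divide by 1000 only when the largest distance exceeds 1000
rescale_dgeo <- function(Dgeo) {
  if (max(Dgeo) > 1000) Dgeo / 1000 else Dgeo
}

# Three points forming a 3000-4000-5000 m right triangle
Dgeo_m  <- dist(cbind(x = c(0, 3000, 0), y = c(0, 0, 4000)))
Dgeo_km <- rescale_dgeo(Dgeo_m)
```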
If bins is of length 1, it is interpreted as the number of (even) bins to
use. In this case the starting point is always the minimum value in the
distance matrix, and the end point is the maximum. If it is a numeric vector
of length > 1, it is interpreted as the break points. In this case, the first
value has to be the lowest and the last has to be the highest. There are no
internal checks for this, and it is the user's responsibility to ensure that
distance classes are properly set up. If they are not, data that fall outside
the range provided will be dropped. The number of bins will be
length(bins) - 1.
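The two ways of specifying bins can be illustrated with base R's cut(). This is only a sketch with made-up distances; whether the function closes its intervals on the left or the right is not stated here, so the include.lowest setting below is an assumption.

```r
# Hypothetical pairwise geographic distances
d <- c(150, 900, 2500, 4100, 7800, 12000)

# Single number: 5 even classes spanning min(d) to max(d)
breaks_even <- seq(min(d), max(d), length.out = 5 + 1)
class_even  <- cut(d, breaks = breaks_even, include.lowest = TRUE)

# Numeric vector: explicit break points, giving length(bins) - 1 classes
bins       <- seq(0, 10000, 2000)
class_user <- cut(d, breaks = bins, include.lowest = TRUE)

# The distance of 12000 lies beyond the last break, so it is assigned NA,
# which is the analogue of being dropped from the analysis
table(class_user, useNA = "ifany")
```

Checking for NA values in this way before running the analysis is a quick safeguard against break points that do not cover the full range of distances.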
The permutation constructs the 95% confidence intervals around the null hypothesis of no spatial structure (this is a two-tail test). The same data are also used to calculate the probability of the one-tail test (See references below for details).
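The logic of the permutation test can be sketched in base R. The statistic used below is a deliberately simple stand-in (overall mean genetic distance minus the mean within the shortest distance class), not the r of Smouse and Peakall, and all data are simulated. The exact formula the function uses for the one-tail probability may differ from the plain proportion shown here.

```r
set.seed(42)
n    <- 30
xy   <- cbind(runif(n), runif(n))        # simulated coordinates
g    <- matrix(rnorm(n * 10), nrow = n)  # simulated stand-in "genotypes"
Dgeo <- as.matrix(dist(xy))
Dgen <- as.matrix(dist(g))

# Pairs falling in the shortest distance class (threshold is arbitrary)
in_class <- Dgeo < 0.25 & upper.tri(Dgeo)

# Stand-in statistic: positive when nearby pairs are genetically closer
stat <- function(D) mean(D[upper.tri(D)]) - mean(D[in_class])

obs  <- stat(Dgen)
reps <- 100
null <- replicate(reps, {
  idx <- sample(n)        # shuffle individuals, breaking the link with space
  stat(Dgen[idx, idx])
})

ci_null <- quantile(null, c(0.025, 0.975))  # two-tail 95% null envelope
p_one   <- mean(null >= obs)                # one-tail probability
```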
Bootstrap calculations are skipped and NA is returned when the number of
possible combinations given the sample size of any given distance class is
< reps.
Methods available to calculate genetic distances for SNP data:
"propShared" using the function gl.propShared.
"grm" using the function gl.grm.
"Euclidean" using the function gl.dist.ind.
"Simple" using the function gl.dist.ind.
"Absolute" using the function gl.dist.ind.
"Manhattan" using the function gl.dist.ind.
Methods available to calculate genetic distances for SilicoDArT data:
"Euclidean" using the function gl.dist.ind.
"Simple" using the function gl.dist.ind.
"Jaccard" using the function gl.dist.ind.
"Bray-Curtis" using the function gl.dist.ind.
When save2tmp = TRUE, plots and tables are saved to the session temporary
directory (tempdir). They can be accessed with the function gl.print.reports
and listed with the function gl.list.reports. Note that they can be accessed
only in the current R session because tempdir is cleared each time the R
session is closed.
Examples of other themes that can be used can be consulted in
Value
Returns a data frame with the following columns:
Bin The distance classes
N The number of pairwise comparisons within each distance class
r.uc The uncorrected autocorrelation coefficient
Correction The correction factor (Peakall et al. 2003)
r The corrected autocorrelation coefficient
L.r The corrected autocorrelation coefficient lower limit (if bootstrap = TRUE)
U.r The corrected autocorrelation coefficient upper limit (if bootstrap = TRUE)
L.r.null.uc The uncorrected lower limit for the null hypothesis of no spatial autocorrelation (if permutation = TRUE)
U.r.null.uc The uncorrected upper limit for the null hypothesis of no spatial autocorrelation (if permutation = TRUE)
L.r.null The corrected lower limit for the null hypothesis of no spatial autocorrelation (if permutation = TRUE)
U.r.null The corrected upper limit for the null hypothesis of no spatial autocorrelation (if permutation = TRUE)
p.one.tail The p value of the one-tail statistical test
Author(s)
Carlo Pacioni, Bernd Gruber & Luis Mijangos (Post to https://groups.google.com/d/forum/dartr)
References
Smouse, PE, Peakall, R. 1999. Spatial autocorrelation analysis of individual multiallele and multilocus genetic structure. Heredity 82, 561-573.
Double, MC, et al. 2005. Dispersal, philopatry and infidelity: dissecting local genetic structure in superb fairy-wrens (Malurus cyaneus). Evolution 59, 625-635.
Peakall, R, et al. 2003. Spatial autocorrelation analysis offers new insights into gene flow in the Australian bush rat, Rattus fuscipes. Evolution 57, 1182-1195.
Smouse, PE, et al. 2008. A heterogeneity test for fine-scale genetic structure. Molecular Ecology 17, 3389-3400.
Gonzales, E, et al. 2010. The impact of landscape disturbance on spatial genetic structure in the Guanacaste tree, Enterolobium cyclocarpum (Fabaceae). Journal of Heredity 101, 133-143.
Beck, N, et al. 2008. Social constraint and an absence of sex-biased dispersal drive fine-scale genetic structure in white-winged choughs. Molecular Ecology 17, 4346-4358.
Examples
require("dartR.data")
res <- gl.spatial.autoCorr(platypus.gl, bins=seq(0,10000,2000))
# using one population, showing sample size
test <- gl.keep.pop(platypus.gl,pop.list = "TENTERFIELD")
res <- gl.spatial.autoCorr(test, bins=seq(0,10000,2000),CI_color = "green")