gl.spatial.autoCorr {dartR}    R Documentation

Spatial autocorrelation following Smouse and Peakall 1999

Description

Global spatial autocorrelation is a multivariate approach combining all loci into a single analysis. The autocorrelation coefficient "r" is calculated for each pair of individuals in each specified distance class. For more information see Smouse and Peakall 1999, Peakall et al. 2003 and Smouse et al. 2008.

Usage

gl.spatial.autoCorr(
  x = NULL,
  Dgeo = NULL,
  Dgen = NULL,
  coordinates = "latlon",
  Dgen_method = "Euclidean",
  Dgeo_trans = "Dgeo",
  Dgen_trans = "Dgen",
  bins = 5,
  reps = 100,
  plot.pops.together = FALSE,
  permutation = TRUE,
  bootstrap = TRUE,
  plot_theme = NULL,
  plot_colors_pop = NULL,
  CI_color = "red",
  plot.out = TRUE,
  save2tmp = FALSE,
  verbose = NULL
)

Arguments

x

Genlight object [default NULL].

Dgeo

Geographic distance matrix if no genlight object is provided. This is typically a Euclidean distance, but it can be any meaningful (geographical) distance metric [default NULL].

Dgen

Genetic distance matrix if no genlight object is provided [default NULL].

coordinates

Can be either 'latlon', 'xy' or a two-column data.frame with column names 'lat', 'lon' or 'x', 'y'. Coordinates are provided via gl@other$latlon ['latlon'] or via gl@other$xy ['xy']. If 'latlon', the data are projected to metres using the Mercator system [Google Maps]; if 'xy', distances are calculated directly on the coordinates [default "latlon"].

Dgen_method

Method to calculate genetic distances. See details [default "Euclidean"].

Dgeo_trans

Transformation to be used on the geographic distances. See Dgen_trans [default "Dgeo"].

Dgen_trans

A formula to transform the genetic distance, written as an R expression with Dgen as the variable to be transformed. For example: Dgen_trans = 'Dgen/(1-Dgen)'. Any valid R expression can be used here [default 'Dgen', which is the identity function].
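As an illustration of how such a string could be applied, the sketch below evaluates the expression in an environment where Dgen is bound to a small matrix. This is an assumption about the mechanism, shown for illustration only, and is not taken from the package's internal code; the helper name apply_trans is hypothetical.

```r
# Hypothetical helper: evaluate a transformation string with `Dgen` bound
# to the supplied matrix. Not taken from dartR's source.
apply_trans <- function(Dgen, trans = "Dgen") {
  eval(parse(text = trans), envir = list(Dgen = Dgen))
}

# Toy symmetric distance matrix with values in [0, 1)
Dgen <- matrix(c(0,   0.2,  0.5,
                 0.2, 0,    0.25,
                 0.5, 0.25, 0), nrow = 3, byrow = TRUE)

identity_out <- apply_trans(Dgen)                   # default: matrix unchanged
ratio_out    <- apply_trans(Dgen, "Dgen/(1-Dgen)")  # the example expression above
```

The same pattern would apply to Dgeo_trans with Dgeo as the bound variable, for instance 'log(Dgeo)'.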

bins

The number of bins for the distance classes (i.e. length(bins) == 1) or a vector with the break points. See details [default 5].

reps

The number to be used for permutation and bootstrap analyses [default 100].

plot.pops.together

Plot all the populations in one plot. Confidence intervals from permutations are not shown [default FALSE].

permutation

Whether permutation calculations for the null hypothesis of no spatial structure should be carried out [default TRUE].

bootstrap

Whether bootstrap calculations to compute the 95% confidence intervals around r should be carried out [default TRUE].

plot_theme

Theme for the plot. See details [default NULL].

plot_colors_pop

A color palette for populations or a list with as many colors as there are populations in the dataset [default NULL].

CI_color

Color for the shade of the 95% confidence intervals around the r estimates [default "red"].

plot.out

Specify if plot is to be produced [default TRUE].

save2tmp

If TRUE, saves any ggplots and listings to the session temporary directory (tempdir) [default FALSE].

verbose

Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default NULL, unless specified using gl.set.verbosity].

Details

This function executes a modified version of spautocorr from the package PopGenReport. Unlike PopGenReport, this function also computes the 95% confidence intervals around r via bootstraps, the 95% confidence intervals around the null hypothesis of no spatial structure and the one-tail test via permutation, and the correction factor described by Peakall et al. 2003.

The input can be i) a genlight object (which has to have the latlon slot populated), ii) a pair of Dgeo and Dgen, which have to be either matrix or dist objects, or iii) a list of matrix or dist objects if the analysis needs to be carried out for multiple populations. In the last case, all the elements of the lists have to be of the same class (i.e. matrix or dist) and the population order in the two lists has to be the same.
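The list form of the input can be sketched with simulated data as follows. The populations, coordinates and genotypes are made up for illustration, sim_pop is a hypothetical helper, and the final call is shown as a comment because it needs dartR to be installed.

```r
set.seed(42)  # reproducible toy data

# Hypothetical helper: simulate one population with coordinates in metres
# and 0/1/2 genotypes, and return the two distance objects
sim_pop <- function(n, n_loci = 15) {
  xy <- cbind(x = runif(n, 0, 5000), y = runif(n, 0, 5000))
  geno <- matrix(rbinom(n * n_loci, size = 2, prob = 0.5), nrow = n)
  list(Dgeo = dist(xy), Dgen = dist(geno))
}

pops <- list(popA = sim_pop(10), popB = sim_pop(8))

# Two parallel lists, same population order, all elements of class "dist"
Dgeo_list <- lapply(pops, `[[`, "Dgeo")
Dgen_list <- lapply(pops, `[[`, "Dgen")

same_order <- identical(names(Dgeo_list), names(Dgen_list))
same_class <- all(vapply(c(Dgeo_list, Dgen_list), inherits, logical(1),
                         what = "dist"))

# With dartR installed, the call would then take the form:
# res <- gl.spatial.autoCorr(Dgeo = Dgeo_list, Dgen = Dgen_list, bins = 5)
```

Building both lists from the same parent list is one simple way to guarantee that the population order matches.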

If the input is a genlight object, the function calculates the linear distance for Dgeo and the relevant Dgen matrix (see Dgen_method) for each population. When the method selected is a genetic similarity matrix (e.g. "simple" distance), the matrix is internally transformed with 1 - Dgen so that positive values of the autocorrelation coefficient indicate more closely related individuals, as implemented in GenAlEx. If the user provides the distance matrices, care must be taken in interpreting the results because a similarity matrix will generate negative values for closely related individuals.

If max(Dgeo)>1000 (e.g. the geographic distances are in thousands of metres), values are divided by 1000 (in this example they would become km) to make the plots easier to read.
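The rescaling rule can be written out as a minimal sketch. The helper rescale_dgeo is hypothetical and only mirrors the rule as described here; it is not the package's own code.

```r
# Hypothetical helper mirroring the rule described above
rescale_dgeo <- function(Dgeo) {
  if (max(Dgeo) > 1000) Dgeo / 1000 else Dgeo
}

Dgeo_m  <- matrix(c(0, 2500, 2500, 0), nrow = 2)  # distances in metres
Dgeo_km <- rescale_dgeo(Dgeo_m)                   # 2500 m becomes 2.5
small   <- rescale_dgeo(matrix(c(0, 800, 800, 0), nrow = 2))  # left unchanged
```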

If bins is of length 1, it is interpreted as the number of (even) bins to use. In this case the starting point is always the minimum value in the distance matrix, and the last is the maximum. If it is a numeric vector of length > 1, it is interpreted as the break points. In this case, the first has to be the lowest value, and the last has to be the highest. There are no internal checks for this and it is the user's responsibility to ensure that distance classes are properly set up. If that is not the case, data that fall outside the range provided will be dropped. The number of bins will be length(bins) - 1.
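The two interpretations of bins can be illustrated with base R's cut(). This is a sketch on made-up distances; whether the function closes intervals on the right, as cut() does by default, is an assumption.

```r
# Toy pairwise geographic distances
d <- c(120, 450, 900, 1500, 2300, 3100, 4800)

# Case 1: a single number gives that many equal-width classes from min to max
n_bins <- 5
breaks_even <- seq(min(d), max(d), length.out = n_bins + 1)
class_even <- cut(d, breaks = breaks_even, include.lowest = TRUE)

# Case 2: explicit break points give length(bins) - 1 classes
breaks_user <- c(0, 1000, 2000, 4000)
class_user <- cut(d, breaks = breaks_user, include.lowest = TRUE)

# 4800 lies beyond the last break point, so it is assigned NA (dropped)
table(class_user, useNA = "ifany")
```

Checking for NA values in the assigned classes before running the analysis is a quick way to spot break points that do not cover the full range of distances.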

The permutation constructs the 95% confidence intervals around the null hypothesis of no spatial structure (this is a two-tail test). The same data are also used to calculate the probability of the one-tail test (See references below for details).
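The permutation logic can be sketched on simulated data. To keep the example short, the statistic used here is the mean genetic distance per distance class, a stand-in that is not the autocorrelation coefficient r of Smouse and Peakall; the shuffling scheme (reassigning individuals while keeping geography fixed) is likewise an assumption made for illustration.

```r
set.seed(1)  # reproducible toy data

n <- 12
xy <- cbind(runif(n, 0, 100), runif(n, 0, 100))    # simulated coordinates
geno <- matrix(rbinom(n * 20, 2, 0.5), nrow = n)   # simulated 0/1/2 genotypes
Dgeo <- as.matrix(dist(xy))
Dgen <- as.matrix(dist(geno))

breaks <- seq(0, max(Dgeo), length.out = 4)        # three distance classes
lt <- lower.tri(Dgeo)
bin <- cut(Dgeo[lt], breaks = breaks, include.lowest = TRUE)

# Stand-in statistic (NOT r): mean genetic distance within each class
bin_stat <- function(Dg) as.vector(tapply(Dg[lt], bin, mean))

obs <- bin_stat(Dgen)

reps <- 100
null <- replicate(reps, {
  idx <- sample(n)              # shuffle individuals, keep geography fixed
  bin_stat(Dgen[idx, idx])
})

# Two-tailed 95% envelope under the null, one row per distance class
envelope <- t(apply(null, 1, quantile, probs = c(0.025, 0.975), na.rm = TRUE))

# One-tailed p: share of permuted values at or below the observed value
p_one_tail <- rowMeans(null <= obs)
```

With a distance-based stand-in, structure shows up as an observed value below the envelope at short distances, which is why the one-tailed comparison uses `<=`; for r the direction of the test would be reversed.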

Bootstrap calculations are skipped and NA is returned when the number of possible combinations given the sample size of any given distance class is < reps.

Methods available to calculate genetic distances for SNP data:

Methods available to calculate genetic distances for SilicoDArT data:

Plots and tables are saved to the temporary directory (tempdir) and can be accessed with the function gl.print.reports and listed with the function gl.list.reports. Note that they can be accessed only in the current R session because tempdir is cleared each time the R session is closed.

Examples of other themes that can be used can be consulted in

Value

Returns a data frame with the following columns:

  1. Bin The distance classes

  2. N The number of pairwise comparisons within each distance class

  3. r.uc The uncorrected autocorrelation coefficient

  4. Correction The correction factor

  5. r The corrected autocorrelation coefficient

  6. L.r The corrected autocorrelation coefficient lower limit (if bootstrap = TRUE)

  7. U.r The corrected autocorrelation coefficient upper limit (if bootstrap = TRUE)

  8. L.r.null.uc The uncorrected lower limit for the null hypothesis of no spatial autocorrelation (if permutation = TRUE)

  9. U.r.null.uc The uncorrected upper limit for the null hypothesis of no spatial autocorrelation (if permutation = TRUE)

  10. L.r.null The corrected lower limit for the null hypothesis of no spatial autocorrelation (if permutation = TRUE)

  11. U.r.null The corrected upper limit for the null hypothesis of no spatial autocorrelation (if permutation = TRUE)

  12. p.one.tail The p value of the one tail statistical test

Author(s)

Carlo Pacioni, Bernd Gruber & Luis Mijangos (Post to https://groups.google.com/d/forum/dartr)

References

Examples


library(dartR.data)
# all populations, with explicit break points from 0 to 10000 in steps of 2000
res <- gl.spatial.autoCorr(platypus.gl, bins = seq(0, 10000, 2000))
# using one population, showing sample size
test <- gl.keep.pop(platypus.gl, pop.list = "TENTERFIELD")
res <- gl.spatial.autoCorr(test, bins = seq(0, 10000, 2000), CI_color = "green")

[Package dartR version 2.9.7 Index]