matrix.p.sig {PCPS} | R Documentation |
Association between phylogeny-weighted species composition and environmental predictors
Description
Analyses to relate an environmental gradient to the phylogenetic assembly of species across a metacommunity by means of phylogenetic fuzzy weighting.
Usage
matrix.p.sig(
comm,
phylodist,
envir,
checkdata = TRUE,
FUN,
runs = 999,
parallel = NULL,
newname = "pcps",
...
)
pcps.sig(
comm,
phylodist,
envir,
checkdata = TRUE,
method = "bray",
squareroot = TRUE,
FUN,
choices,
runs = 999,
parallel = NULL,
newname = "pcps",
...
)
FUN.ADONIS(x, envir, method.p, sqrt.p = TRUE, formula, return.model = FALSE)
FUN.GLM(x, envir, formula, ..., return.model = FALSE)
FUN.GLS.marginal(x, envir, formula, ..., return.model = FALSE)
FUN.GLS.sequential(x, envir, formula, ..., return.model = FALSE)
FUN.LME.marginal(x, envir, formula, ..., return.model = FALSE)
FUN.LME.sequential(x, envir, formula, ..., return.model = FALSE)
FUN.MANTEL(
x,
envir,
method.p,
method.envir,
sqrt.p = TRUE,
...,
return.model = FALSE
)
FUN.RDA(x, envir, return.model = FALSE)
## S3 method for class 'pcpssig'
print(x, ...)
Arguments
comm |
Community data, with species as columns and sampling units as rows. This matrix can contain either presence/absence or abundance data. Alternatively, comm can be an object of class metacommunity.data, which sets all data.frames/matrices at once. When the class metacommunity.data is used, the arguments phylodist and envir must not be specified. See Details. |
phylodist |
Matrix containing phylogenetic distances between species. |
envir |
A matrix or data.frame with environmental variables for each community, with variables as columns and sampling units as rows. See Details and Examples. |
checkdata |
Logical argument (TRUE or FALSE) to check whether the species sequence in the community data follows the same order as in the phylodist matrix and whether the sampling units in the community data follow the same order as in the environmental data (Default checkdata = TRUE). |
FUN |
An object of class function to perform the analysis. See Details and Examples. |
runs |
Number of permutations for assessing significance. |
parallel |
Number of parallel processes or a predefined socket cluster created with the parallel package. Tip: use detectCores() (Default parallel = NULL). |
newname |
New name to be replaced in object returned by |
... |
Other arguments passed to FUN function. See Details and Examples. |
method |
Dissimilarity index, as accepted by |
squareroot |
Logical argument (TRUE or FALSE) to specify whether to use the square root of the dissimilarity index (Default squareroot = TRUE). |
choices |
Numeric vector to choose the PCPS used in analysis. See Details and Examples. |
x |
An object of class pcpssig or other object to apply the function passed by FUN. See Details. |
method.p |
Resemblance index between communities based on P matrix, as accepted by |
sqrt.p |
Logical argument (TRUE or FALSE) to specify whether to use the square root of the dissimilarity computed from matrix P. Used in the FUN.MANTEL, FUN.ADONIS, FUN.ADONIS2.global and FUN.ADONIS2.margin analyses. See Details and Examples (Default sqrt.p = TRUE). |
formula |
An object of class |
return.model |
Must not be specified. See Details. |
method.envir |
Resemblance index between communities based on environmental variables, as accepted by |
Details
Each metacommunity is submitted to phylogenetic fuzzy weighting, generating a matrix
that describes the phylogeny-weighted species composition of the communities
(matrix.p). The function matrix.p.sig directly tests the association of
this matrix with the environmental predictors. The pairwise dissimilarities are
submitted to a Mantel test (mantel) or an ADONIS test (adonis or adonis2)
to evaluate the influence of an environmental gradient on species dispersion across
the communities. The function pcps.sig generates principal coordinates of phylogenetic
structure (pcps) and uses a single axis to run a generalized linear
model (GLM, glm), a linear model using generalized least squares (GLS, gls) or a
linear mixed-effects model (LME, lme), or uses a set of axes to run a distance-based redundancy
analysis (db-RDA, rda).
The sequence in which species appear in the community data matrix must be the
same as in the phylogenetic distance matrix and, similarly,
the sequence of communities in the community data matrix must be the same as that in
the environmental data. The function organize.pcps organizes the data, placing the matrices of
community, phylogenetic distance and environmental data in the same order. Using
organize.pcps is not required to run the functions, but it is recommended. The
arguments comm and phylodist can thus be specified as normal arguments or supplied
through the object returned by the function organize.pcps, using in this
case only the argument comm. With the object returned by organize.pcps, the comm argument
is an alternative way to set all data.frames/matrices, and therefore
the arguments phylodist and envir must not be specified.
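The ordering requirement described above can be checked by hand before calling the functions. The following is a minimal base R sketch, assuming that all three objects carry species and site names; the toy objects and values are illustrative only, and this is not how organize.pcps itself is implemented.

```r
# Toy data; all object names and values here are illustrative assumptions
comm <- matrix(c(1, 0, 2,
                 0, 3, 1), nrow = 2, byrow = TRUE,
               dimnames = list(c("site1", "site2"), c("sp1", "sp2", "sp3")))
sp.shuffled <- c("sp3", "sp1", "sp2")
phylodist <- matrix(c(0, 4, 4,
                      4, 0, 2,
                      4, 2, 0), nrow = 3, byrow = TRUE,
                    dimnames = list(sp.shuffled, sp.shuffled))
envir <- data.frame(temp = c(25, 20), row.names = c("site2", "site1"))

# Check whether species and sites already follow the same order
same.species <- identical(colnames(comm), rownames(phylodist))
same.sites <- identical(rownames(comm), rownames(envir))

# Reorder the phylogenetic distances and the environmental data by name
# so that both follow the community matrix
phylodist.ord <- phylodist[colnames(comm), colnames(comm)]
envir.ord <- envir[rownames(comm), , drop = FALSE]
```

Matching by name rather than by position keeps the check independent of how the input files were originally sorted.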
Significance is obtained via two null models, one that shuffles sites across the
environmental gradient and another that shuffles terminal tips (taxa) across the phylogenetic
tree. The first null model (site shuffle) shuffles the site positions across the environmental
gradient and reruns the same model, generating a null F value (or r value in the Mantel test). The
second null model (taxa shuffle) shuffles terminal tips across the phylogenetic tree,
generates a null matrix containing phylogeny-weighted species composition and reruns the same
model, generating another null F value. In the pcps.sig function, a set of null PCPS is generated
and each null PCPS (or set of PCPS in RDA) is submitted to a procrustean adjustment
(see procrustes), and the fitted values between observed PCPS and null PCPS are
obtained. The adjusted null PCPS are used to rerun the model, generating another null F value.
The observed F value (or r value) is compared independently with both null sets of F values
(or r values) to generate the probability of the original F value being generated merely by
chance according to each null model.
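The site shuffle logic described above can be illustrated with a small base R sketch. It uses simulated data in place of a real PCPS axis, a plain linear model in place of the predefined functions, and one common convention for the permutation p value (counting the observed value among the null values); the package may compute the probability differently, so every name and formula here should be read as an assumption.

```r
set.seed(42)
n <- 30
envir <- data.frame(temp = rnorm(n))
# Simulated stand-in for a single PCPS axis (assumption, not real PCPS output)
axis1 <- 0.8 * envir$temp + rnorm(n)

# Overall F value of a linear model relating the axis to the predictor
f.value <- function(y, env){
  unname(summary(stats::lm(y ~ temp, data = env))$fstatistic[1])
}

f.obs <- f.value(axis1, envir)

# Site shuffle: permute the rows of the environmental data and refit
runs <- 99
f.null.site <- replicate(runs, {
  shuffled <- envir[sample(n), , drop = FALSE]
  f.value(axis1, shuffled)
})

# Proportion of null F values at least as large as the observed one
p.site.shuffle <- (sum(f.null.site >= f.obs) + 1) / (runs + 1)
```

The taxa shuffle follows the same pattern, except that the permutation is applied to the tips of the phylogeny and the phylogeny-weighted matrix is recomputed before refitting.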
The argument FUN
The type of analysis performed by this function is specified using the argument FUN. The current version of the package includes ten predefined functions; however, additional small functions can easily be specified. All of these functions use the environmental variables to analyze the association between phylogeny-weighted species composition and environmental predictors. For matrix P analysis, in the matrix.p.sig function, the predefined functions available are FUN.MANTEL, FUN.ADONIS, FUN.ADONIS2.global and FUN.ADONIS2.margin. For PCPS analysis, in the pcps.sig function, the predefined functions available are FUN.GLM, FUN.RDA, FUN.GLS.marginal, FUN.GLS.sequential, FUN.LME.marginal and FUN.LME.sequential. The significance for each null model is assessed as described here, NOT using the p value of the underlying functions.
FUN.MANTEL
Mantel test that can be used in matrix P analysis. The arguments method.p and sqrt.p determine the resemblance
index between communities based on the P matrix. The argument method.envir determines the resemblance
index between communities based on environmental variables. The significance is assessed using the r value; see more in mantel.
FUN.ADONIS
Multivariate analysis of variance that can be used in matrix P analysis. The arguments method.p and sqrt.p determine the resemblance
index between communities based on the P matrix. The argument formula must be specified, where the left-hand side gives
the resemblance data and the right-hand side gives the variables. The resemblance data is internally named p.dist,
thus formula is an expression of the form p.dist ~ model (see Examples). The significance is assessed using the overall F value;
see more in adonis.
FUN.ADONIS2.global and FUN.ADONIS2.margin
Multivariate analysis of variance that can be used in matrix P analysis. The arguments method.p and sqrt.p determine the resemblance
index between communities based on the P matrix. The argument formula must be specified, where the left-hand side gives
the resemblance data and the right-hand side gives the variables. The resemblance data is internally named p.dist,
thus formula is an expression of the form p.dist ~ model (see Examples). The significance is assessed using the F value,
and the two functions differ in the argument by passed to adonis2. The function
FUN.ADONIS2.global uses by = NULL as default to assess the overall significance of all terms together,
whereas the function FUN.ADONIS2.margin uses by = "margin" as default to assess the marginal effects of
the terms and returns F and p values for each term. See more in adonis2.
The function adonis2 evaluates the formula argument in the global environment; however, CRAN
does not allow assignments to the global environment. As a temporary workaround, copy and run the lines below to make
the functions FUN.ADONIS2.global and FUN.ADONIS2.margin available.
FUN.ADONIS2.global <- function(x, envir, method.p, formula, sqrt.p = TRUE, return.model = FALSE){
  p.dist <- vegan::vegdist(x, method = method.p)
  if(sqrt.p){
    p.dist <- sqrt(p.dist)
  }
  assign("p.dist", p.dist, envir = globalenv())
  mod.obs <- vegan::adonis2(formula, data = data.frame(envir), permutations = 0, by = NULL, parallel = NULL)
  rm(p.dist, envir = globalenv())
  statistic.obs <- mod.obs$F[1]
  if(return.model){
    res <- list()
    res$mod.obs <- mod.obs
    res$statistic.obs <- statistic.obs
  } else{
    res <- statistic.obs
  }
  return(res)
}
FUN.ADONIS2.margin <- function(x, envir, method.p, formula, sqrt.p = TRUE, return.model = FALSE){
  p.dist <- vegan::vegdist(x, method = method.p)
  if(sqrt.p){
    p.dist <- sqrt(p.dist)
  }
  assign("p.dist", p.dist, envir = globalenv())
  mod.obs <- vegan::adonis2(formula, data = data.frame(envir), permutations = 2, by = "margin", parallel = NULL)
  rm(p.dist, envir = globalenv())
  nf <- length(mod.obs$F)-2
  statistic.obs <- mod.obs$F[seq_len(nf)]
  if(return.model){
    res <- list()
    res$mod.obs <- mod.obs
    res$statistic.obs <- statistic.obs
  } else{
    res <- statistic.obs
  }
  return(res)
}
FUN.GLM
Generalized linear models that can be used in PCPS analysis. The argument formula must be specified, where the left-hand side gives the PCPS used
and the right-hand side gives the variables. The PCPS are internally named sequentially pcps.1, pcps.2, pcps.3 and so
on. Thus, formula is an expression of the form pcps.1 ~ model (see Examples). The types of the environmental variables are
extracted directly from the envir argument, thus variables of class factor can either be already
specified in the envir data.frame or be set through the formula argument. The significance is assessed using the overall
F value; see more in glm.
FUN.RDA
Redundancy analysis that can be used in PCPS analysis. The RDA is performed using all PCPS specified with the choices argument and
all environmental variables specified by the envir argument. The significance is assessed using the overall
F value; see more in rda.
FUN.GLS.marginal and FUN.GLS.sequential
Linear model using generalized least squares that can be used in PCPS analysis. The argument formula must be specified, where the left-hand side gives the PCPS used
and the right-hand side gives the variables. The PCPS are internally named sequentially pcps.1, pcps.2, pcps.3 and so
on. Thus, formula is an expression of the form pcps.1 ~ model (see Examples). The types of the environmental variables are
extracted directly from the envir argument, thus variables of class factor can either be already
specified in the envir data.frame or be set through the formula argument. The significance is assessed using the F value,
and the two functions differ in the argument type passed to anova.gls. The function
FUN.GLS.marginal uses type = "marginal" as default to assess the marginal significance of all terms,
whereas the function FUN.GLS.sequential uses type = "sequential" as default to assess the sequential effects of
the terms. These functions return all F values calculated by anova.gls, including the intercept if it is in the model.
Additional arguments such as correlation can be passed via the ... argument. See more in gls and anova.gls.
FUN.LME.marginal and FUN.LME.sequential
Linear mixed-effects models that can be used in PCPS analysis. The argument formula must be specified, where the left-hand side gives the PCPS used
and the right-hand side gives the variables. The PCPS are internally named sequentially pcps.1, pcps.2, pcps.3 and so
on. Thus, formula is an expression of the form pcps.1 ~ model (see Examples). The types of the environmental variables are
extracted directly from the envir argument, thus variables of class factor can either be already
specified in the envir data.frame or be set through the formula argument. The significance is assessed using the F value,
and the two functions differ in the argument type passed to anova.lme. The function
FUN.LME.marginal uses type = "marginal" as default to assess the marginal significance of all terms,
whereas the function FUN.LME.sequential uses type = "sequential" as default to assess the sequential effects of
the terms. These functions return all F values calculated by anova.lme, including the intercept if it is in the model.
Additional arguments such as correlation and random can be passed via the ... argument. See more in lme and anova.lme.
Additional function
The functions matrix.p.sig and pcps.sig only perform the permutations following the null models and apply the function passed via FUN to all permuted matrices. Additional functions can easily be specified and passed via the FUN argument. A skeleton of such a function is shown below. In this function the argument x will always be the matrix P or a matrix with the chosen PCPS, whereas additional arguments such as envir specify the statistical analysis performed on matrix P or PCPS. The function must return the observed statistic. In addition, the return.model argument must not be specified by the user, because it controls the return options used for the observed and null statistics.
FUN.X <- function(x, envir, ..., return.model = FALSE){
  mod.obs <- # Function to perform analysis using x, envir and any additional argument
  statistic.obs <- # Extract only the numeric values of the observed statistic
  # Next lines are mandatory
  if(return.model){
    res <- list()
    res$mod.obs <- mod.obs
    res$statistic.obs <- statistic.obs
  } else{
    res <- statistic.obs
  }
  return(res)
}
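As an illustration of the skeleton above, the following is a minimal sketch of a custom function based on an ordinary linear model. The name FUN.LM.sketch is hypothetical and not part of the package. The sketch assumes that x reaches the function with columns named pcps.1, pcps.2 and so on, as described for the predefined functions; this has not been verified against the package internals.

```r
# Hypothetical custom function following the skeleton; FUN.LM.sketch is an
# illustrative name, not a function provided by the PCPS package
FUN.LM.sketch <- function(x, envir, formula, ..., return.model = FALSE){
  # Combine the PCPS axes with the environmental variables (assumes that
  # x has column names such as pcps.1)
  data.used <- data.frame(x, envir)
  mod.obs <- stats::lm(formula, data = data.used, ...)
  # Extract only the numeric value of the observed statistic (overall F)
  statistic.obs <- unname(summary(mod.obs)$fstatistic[1])
  # Next lines are mandatory
  if(return.model){
    res <- list()
    res$mod.obs <- mod.obs
    res$statistic.obs <- statistic.obs
  } else{
    res <- statistic.obs
  }
  return(res)
}

# Stand-alone check with simulated data (values and names are assumptions)
set.seed(1)
x <- matrix(rnorm(20), ncol = 2, dimnames = list(NULL, c("pcps.1", "pcps.2")))
envir <- data.frame(temp = rnorm(10))
f.obs <- FUN.LM.sketch(x, envir, formula = pcps.1 ~ temp)
full <- FUN.LM.sketch(x, envir, formula = pcps.1 ~ temp, return.model = TRUE)
```

A function of this shape would presumably be passed as FUN = FUN.LM.sketch together with formula and choices in a pcps.sig call, in the same way as FUN.GLM in the Examples.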
Value
call |
The arguments used. |
P.obs |
Phylogeny-weighted species composition matrix. |
PCPS.obs |
The principal coordinates of phylogenetic structure (PCPS) |
model |
The observed model returned by FUN: an object of class glm, gls, lme, rda, adonis, adonis2 or mantel for the predefined functions. |
fun |
The function used. |
statistic.null.site |
A matrix with null statistic for site shuffle null model. |
statistic.null.taxa |
A matrix with null statistic for taxa shuffle null model. |
obs.statistic |
Observed statistic: the F value or r value for the predefined functions. |
p.site.shuffle |
The p value for the site shuffle null model. |
p.taxa.shuffle |
The p value for the taxa shuffle null model. |
Note
IMPORTANT: The sequence of species in the community data matrix
MUST be the same as that in the phylogenetic distance matrix and, similarly,
the sequence of communities in the community data matrix MUST be the same as that in
the environmental data. See Details and organize.pcps.
Author(s)
Vanderlei Julio Debastiani <vanderleidebastiani@yahoo.com.br>
References
Duarte, L.S. (2011). Phylogenetic habitat filtering influences forest nucleation in grasslands. Oikos, 120, 208-215.
Duarte, L.S. et al. (2016). Dissecting phylogenetic fuzzy weighting: theory and application in metacommunity phylogenetics. Methods in Ecology and Evolution, 7(8), 937-946.
See Also
matrix.p, pcps, procrustes, glm, rda, adonis, adonis2, mantel
Examples
## Not run:
data(flona)
# MANTEL
res <- matrix.p.sig(flona$community,flona$phylo, FUN = FUN.MANTEL, method.p = "bray",
method.envir = "euclidean", envir = flona$environment[, 2, drop = FALSE], runs = 99)
res
# ADONIS
res <- matrix.p.sig(flona$community,flona$phylo, FUN = FUN.ADONIS, method.p = "bray",
formula = p.dist~temp, envir = flona$environment[, 2, drop = FALSE], runs = 99)
res
# ADONIS2
res <- matrix.p.sig(flona$community,flona$phylo, FUN = FUN.ADONIS2.global,
envir = flona$environment, formula = p.dist~temp+alt,
method.p = "bray", runs = 99)
res
res <- matrix.p.sig(flona$community,flona$phylo, FUN = FUN.ADONIS2.margin,
envir = flona$environment, formula = p.dist~temp+alt,
method.p = "bray", runs = 99)
res
# GLM
res <- pcps.sig(flona$community, flona$phylo, FUN = FUN.GLM, method = "bray",
formula = pcps.1~temp, envir = flona$environment, choices = 1, runs = 99)
res
summary.lm(res$model)
# RDA
res <- pcps.sig(flona$community, flona$phylo, FUN = FUN.RDA, envir = flona$environment,
choices = 1:2, runs = 99)
res
# GLS
res <- pcps.sig(flona$community, flona$phylo, FUN = FUN.GLS.marginal,
formula = pcps.1~temp, envir = flona$environment, choices = 1, runs = 99)
res
anova(res$model, type = "marginal")
res <- pcps.sig(flona$community, flona$phylo, FUN = FUN.GLS.marginal,
formula = pcps.1~temp, envir = flona$environment,
correlation = nlme::corCAR1(form = ~1:39), choices = 1, runs = 99)
res
anova(res$model, type = "marginal")
# LME
res <- pcps.sig(flona$community, flona$phylo, FUN = FUN.LME.marginal, formula = pcps.1~alt,
envir = flona$environment, random = ~1|temp, choices = 1, runs = 99)
res
anova(res$model, type = "marginal")
res <- pcps.sig(flona$community, flona$phylo, FUN = FUN.LME.sequential, formula = pcps.1~alt,
envir = flona$environment, random = ~1|temp, choices = 1, runs = 99)
res
anova(res$model, type = "sequential")
## End(Not run)