cpr_rand_test {canaper} | R Documentation |
Run a randomization analysis for one or more biodiversity metrics
Description
The observed value of the biodiversity metric(s) will be calculated for the input community data, then compared against a set of random communities. Various statistics are calculated from the comparison (see Value below).
Usage
cpr_rand_test(
  comm,
  phy,
  null_model,
  n_reps = 100,
  n_iterations = 10000,
  thin = 1,
  metrics = c("pd", "rpd", "pe", "rpe"),
  site_col = "site",
  tbl_out = tibble::is_tibble(comm),
  quiet = FALSE
)
Arguments
comm |
Dataframe, tibble, or matrix; input community data with sites (communities) as rows and species as columns. Either presence-absence data (values only 0s or 1s) or abundance data (values >= 0) accepted, but calculations do not use abundance-weighting, so results from abundance data will be the same as if converted to presence-absence before analysis. |
phy |
List of class |
null_model |
Character vector of length 1 or object of class |
n_reps |
Numeric vector of length 1; number of random communities to replicate. |
n_iterations |
Numeric vector of length 1; number of iterations to use for sequential null models; ignored for non-sequential models. |
thin |
Numeric vector of length 1; thinning parameter used by some
null models in |
metrics |
Character vector; names of biodiversity metrics to calculate. May include one or more of: pd, rpd, pe, rpe. |
site_col |
Character vector of length 1; name of column in comm that contains the site names (applies when comm is a tibble). |
tbl_out |
Logical vector of length 1; should the output be returned as
a tibble? If |
quiet |
Logical vector of length 1; if |
Details
The biodiversity metrics (metrics) available for analysis include:
- pd: Phylogenetic diversity (Faith 1992)
- rpd: Relative phylogenetic diversity (Mishler et al. 2014)
- pe: Phylogenetic endemism (Rosauer et al. 2009)
- rpe: Relative phylogenetic endemism (Mishler et al. 2014)
(pe and rpe are needed for CANAPE with cpr_classify_endem())
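As a rough sketch of how pe and rpe feed into CANAPE, the output of cpr_rand_test() can be passed to cpr_classify_endem(). This assumes the canaper package is installed, and that the classification is returned in a column named endem_type (an assumption here; check the help file of cpr_classify_endem()). The small n_reps is only to keep the example fast and is too low for a real analysis.

```r
library(canaper)

set.seed(12345)
data(phylocom)

# Randomization test restricted to the metrics CANAPE needs
rand_res <- cpr_rand_test(
  phylocom$comm, phylocom$phy,
  null_model = "curveball", metrics = c("pe", "rpe"), n_reps = 10
)

# Classify endemism types from the randomization results
canape_res <- cpr_classify_endem(rand_res)

# Tabulate the classification (column name assumed to be endem_type)
table(canape_res$endem_type)
```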
The choice of a randomization algorithm (null_model) is not trivial, and may strongly affect results. cpr_rand_test() uses null models provided by vegan; for a complete list, see the help file of vegan::commsim() or run vegan::make.commsim(). One frequently used null model is swap (Gotelli & Entsminger 2003), which randomizes the community matrix while preserving column and row sums (marginal sums). For a review of various null models, see Strona et al. (2018); swap is an "FF" model in the sense of Strona et al. (2018).
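The following sketch lists the predefined null models and checks, on a small made-up matrix, that swap keeps the row and column sums fixed. It calls vegan directly rather than going through canaper; the matrix, the burnin and thin values, and the object names are illustrative assumptions.

```r
# Assumes the vegan package is installed
library(vegan)

# Names of the predefined null models
model_names <- make.commsim()
model_names

# Small illustrative presence-absence matrix (hypothetical data)
comm <- matrix(
  c(
    1, 0, 1, 0, 1,
    0, 1, 0, 1, 0,
    1, 1, 0, 0, 1,
    0, 0, 1, 1, 0,
    1, 0, 0, 1, 1
  ),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("site", 1:5), paste0("sp", 1:5))
)

# Generate three random communities with the sequential swap algorithm
set.seed(1)
nm <- nullmodel(comm, "swap")
sims <- simulate(nm, nsim = 3, burnin = 100, thin = 10)

# Take the first random community and compare its marginal sums
rand_1 <- sims[, , 1]
all(rowSums(rand_1) == rowSums(comm))
all(colSums(rand_1) == colSums(comm))
```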
Instead of using one of the predefined null models in vegan::commsim(), it is also possible to define a custom null model; see Examples in cpr_rand_comm().
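A minimal sketch of what a custom null model might look like, assuming the vegan package is installed. The randomizer below (a hypothetical function that simply shuffles all cells of the matrix) and the model name "shuffle_all" are illustrative only; the Examples in cpr_rand_comm() remain the reference.

```r
library(vegan)

# Hypothetical randomizer: shuffles all cells of the matrix, n times.
# Returns an array with dimensions (rows, columns, n).
randomizer <- function(x, n, ...) {
  array(replicate(n, sample(x)), c(dim(x), n))
}

# Wrap the function in a commsim object
cs_object <- commsim(
  method = "shuffle_all",
  fun = randomizer,
  binary = TRUE,
  isSeq = FALSE,
  mode = "integer"
)

# Quick check of the randomizer on a toy 2 x 3 matrix
x <- matrix(c(1, 0, 0, 1, 1, 0), nrow = 2)
out <- randomizer(x, n = 4)
dim(out)
```

An object like cs_object could then be supplied as null_model in place of a model name.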
Note that the predefined models in vegan include binary models (designed for presence-absence data) and quantitative models (designed for abundance data). Although the binary models will accept abundance data, they treat it as binary and always return a binary (presence-absence) matrix. The PD and PE calculations in canaper are not abundance-weighted, so they return the same result regardless of whether the input is presence-absence or abundance. In that sense, binary null models are appropriate for cpr_rand_test(). The quantitative models could also be used for abundance data, but the output will be treated as binary anyway when calculating PD and PE. The effects of using binary vs. quantitative null models for cpr_rand_test() have not been investigated.
A minimum of 5 species and 5 sites is required as input; fewer than that is likely to cause some randomization algorithms (e.g., swap) to enter an infinite loop. Besides, inference on very small numbers of species and/or sites is generally not recommended.
The following rules apply to comm input:
- If dataframe or matrix, must include row names (site names) and column names (species names).
- If tibble, a single column (default, site) must be included with site names, and other columns must correspond to species names.
- Column names cannot start with a number and must be unique.
- Row names (site names) must be unique.
- Values (other than site names) should only include integers >= 0; non-integer input will be converted to integer.
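The rules above, together with the minimum size requirement, can be sketched as a pre-flight check in base R. check_comm() is a hypothetical helper written for this illustration, not part of canaper, and it only handles matrix input.

```r
# Hypothetical helper (not part of canaper): checks a community matrix
# against the input rules and returns a character vector of problems
# (empty if none are found).
check_comm <- function(comm) {
  rn <- rownames(comm)
  cn <- colnames(comm)
  if (is.null(rn) || is.null(cn)) {
    return("row names and column names are required")
  }
  problems <- character(0)
  if (anyDuplicated(rn) > 0) {
    problems <- c(problems, "site names must be unique")
  }
  if (anyDuplicated(cn) > 0) {
    problems <- c(problems, "species names must be unique")
  }
  if (any(grepl("^[0-9]", cn))) {
    problems <- c(problems, "species names cannot start with a number")
  }
  if (!is.numeric(comm) || any(comm < 0, na.rm = TRUE)) {
    problems <- c(problems, "values must be numeric and >= 0")
  }
  if (nrow(comm) < 5 || ncol(comm) < 5) {
    problems <- c(problems, "at least 5 sites and 5 species are required")
  }
  problems
}

# A 5 x 5 matrix that follows the rules (made-up data)
good <- matrix(
  rep(c(1, 0), length.out = 25),
  nrow = 5,
  dimnames = list(paste0("site", 1:5), paste0("sp", 1:5))
)
check_comm(good)

# Too few sites, and one species name starting with a number
bad <- good[1:3, ]
colnames(bad)[1] <- "1sp"
check_comm(bad)
```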
The results are identical regardless of whether the input for comm is abundance or presence-absence data (i.e., abundance weighting is not used).
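One way to see this is to convert the example abundance data to presence-absence and compare the observed values. This is a sketch that assumes the canaper package is installed; the object names are illustrative, and n_reps is kept small only for speed.

```r
library(canaper)
data(phylocom)

# Convert abundances to presence-absence (1 if present, 0 otherwise)
comm_pa <- (phylocom$comm > 0) * 1

# Run the same test on the abundance data...
set.seed(12345)
res_abund <- cpr_rand_test(
  phylocom$comm, phylocom$phy,
  null_model = "curveball", metrics = "pd", n_reps = 10
)

# ...and on the presence-absence version
set.seed(12345)
res_pa <- cpr_rand_test(
  comm_pa, phylocom$phy,
  null_model = "curveball", metrics = "pd", n_reps = 10
)

# Observed PD should not depend on abundance
all.equal(res_abund$pd_obs, res_pa$pd_obs)
```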
Value
Dataframe. For each of the biodiversity metrics, the following 9 columns will be produced:
- *_obs: Observed value
- *_obs_c_lower: Count of times observed value was lower than random values
- *_obs_c_upper: Count of times observed value was higher than random values
- *_obs_p_lower: Percentage of times observed value was lower than random values
- *_obs_p_upper: Percentage of times observed value was higher than random values
- *_obs_q: Count of the non-NA random values used for comparison
- *_obs_z: Standard effect size (z-score)
- *_rand_mean: Mean of the random values
- *_rand_sd: Standard deviation of the random values
So if you included pd in metrics, the output columns would include pd_obs, pd_obs_c_lower, etc.
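To make the relationship between these columns concrete, here is a base R sketch for a single site and a single metric. The numbers are made up, and two details are assumptions rather than confirmed behavior of canaper: strict inequalities are used for the counts, and the "percentage" columns are computed as proportions (count divided by *_obs_q).

```r
# Hypothetical observed value and random values for one site
obs <- 9.5
rand <- c(2, 4, 4, 5, 7, 8, 9, 10, 11, 12)

# Only non-NA random values are used for comparison
rand_vals <- rand[!is.na(rand)]
obs_q <- length(rand_vals)

# Counts of times the observed value was higher or lower than random values
obs_c_upper <- sum(obs > rand_vals)
obs_c_lower <- sum(obs < rand_vals)

# Proportions of those counts (assumed: count / obs_q)
obs_p_upper <- obs_c_upper / obs_q
obs_p_lower <- obs_c_lower / obs_q

# Mean, standard deviation, and z-score relative to the random values
rand_mean <- mean(rand_vals)
rand_sd <- sd(rand_vals)
obs_z <- (obs - rand_mean) / rand_sd

data.frame(
  obs, obs_c_lower, obs_c_upper, obs_p_lower, obs_p_upper,
  obs_q, obs_z, rand_mean, rand_sd
)
```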
References
Faith, D.P. (1992) Conservation evaluation and phylogenetic diversity. Biological Conservation, 61: 1–10. doi:10.1016/0006-3207(92)91201-3
Gotelli, N.J. and Entsminger, G.L. (2003) Swap algorithms in null model analysis. Ecology, 84: 532–535.
Mishler, B., Knerr, N., González-Orozco, C. et al. (2014) Phylogenetic measures of biodiversity and neo- and paleo-endemism in Australian Acacia. Nature Communications, 5: 4473. doi:10.1038/ncomms5473
Rosauer, D., Laffan, S.W., Crisp, M.D., Donnellan, S.C. and Cook, L.G. (2009) Phylogenetic endemism: a new approach for identifying geographical concentrations of evolutionary history. Molecular Ecology, 18: 4061–4072. doi:10.1111/j.1365-294X.2009.04311.x
Strona, G., Ulrich, W. and Gotelli, N.J. (2018) Bi-dimensional null model analysis of presence-absence binary matrices. Ecology, 99: 103–115. doi:10.1002/ecy.2043
Examples
library(canaper)

set.seed(12345)
data(phylocom)

# Returns a dataframe by default
cpr_rand_test(
  phylocom$comm, phylocom$phy,
  null_model = "curveball", metrics = "pd", n_reps = 10
)

# Tibbles may be preferable because of the large number of columns
cpr_rand_test(
  phylocom$comm, phylocom$phy,
  null_model = "curveball", tbl_out = TRUE, n_reps = 10
)