betadisper {vegan} | R Documentation |
Multivariate homogeneity of groups dispersions (variances)
Description
Implements Marti Anderson's PERMDISP2 procedure for the analysis of
multivariate homogeneity of group dispersions (variances). betadisper
is a multivariate analogue of Levene's test for homogeneity of
variances. Non-Euclidean distances between objects and group centres
(centroids or medians) are handled by reducing the original distances
to principal coordinates. This procedure has latterly been used as a
means of assessing beta diversity. There are anova, scores, plot and
boxplot methods.
TukeyHSD.betadisper creates a set of confidence intervals on
the differences between the mean distance-to-centroid of the levels of
the grouping factor with the specified family-wise probability of
coverage. The intervals are based on the Studentized range statistic,
Tukey's 'Honest Significant Difference' method.
Usage
betadisper(d, group, type = c("median","centroid"), bias.adjust = FALSE,
sqrt.dist = FALSE, add = FALSE)
## S3 method for class 'betadisper'
anova(object, ...)
## S3 method for class 'betadisper'
scores(x, display = c("sites", "centroids"),
choices = c(1,2), ...)
## S3 method for class 'betadisper'
eigenvals(x, ...)
## S3 method for class 'betadisper'
plot(x, axes = c(1,2), cex = 0.7,
pch = seq_len(ng), col = NULL, lty = "solid", lwd = 1, hull = TRUE,
ellipse = FALSE, conf,
segments = TRUE, seg.col = "grey", seg.lty = lty, seg.lwd = lwd,
label = TRUE, label.cex = 1,
ylab, xlab, main, sub, ...)
## S3 method for class 'betadisper'
boxplot(x, ylab = "Distance to centroid", ...)
## S3 method for class 'betadisper'
TukeyHSD(x, which = "group", ordered = FALSE,
conf.level = 0.95, ...)
## S3 method for class 'betadisper'
print(x, digits = max(3, getOption("digits") - 3),
neigen = 8, ...)
Arguments
d |
a distance structure such as that returned by
|
group |
vector describing the group structure, usually a factor
or an object that can be coerced to a factor using
|
type |
the type of analysis to perform. Use the spatial median or the group centroid? The spatial median is now the default. |
bias.adjust |
logical: adjust for small sample bias in beta diversity estimates? |
sqrt.dist |
Take square root of dissimilarities. This often euclidifies dissimilarities. |
add |
Add a constant to the non-diagonal dissimilarities such
that all eigenvalues are non-negative in the underlying Principal
Co-ordinates Analysis (see |
display |
character; partial match to access scores for
|
object , x |
an object of class |
choices , axes |
the principal coordinate axes wanted. |
hull |
logical; should the convex hull for each group be plotted? |
ellipse |
logical; should the standard deviation data ellipse for each group be plotted? |
conf |
Expected fractions of data coverage for data ellipses,
e.g. 0.95. The default is to draw a 1 standard deviation data
ellipse, but if supplied, |
pch |
plot symbols for the groups, a vector of length equal to the number of groups. |
col |
colors for the plot symbols and centroid labels for the groups, a vector of length equal to the number of groups. |
lty , lwd |
linetype, linewidth for convex hulls and confidence ellipses. |
segments |
logical; should segments joining points to their centroid be drawn? |
seg.col |
colour to draw segments between points and their centroid. Can be a vector, in which case one colour per group. |
seg.lty , seg.lwd |
linetype and line width for segments. |
label |
logical; should the centroids be labelled with their respective factor label? |
label.cex |
numeric; character expansion for centroid labels. |
cex , ylab , xlab , main , sub |
graphical parameters. For details,
see |
which |
A character vector listing terms in the fitted model for which the intervals should be calculated. Defaults to the grouping factor. |
ordered |
logical; see |
conf.level |
A numeric value between zero and one giving the family-wise confidence level to use. |
digits , neigen |
numeric; for the |
... |
arguments, including graphical parameters (for
|
Details
One measure of multivariate dispersion (variance) for a group of samples is the average distance of group members to the group centroid or spatial median (both referred to as 'centroid' from now on unless stated otherwise) in multivariate space. To test if the dispersions (variances) of one or more groups are different, the distances of group members to the group centroid are subject to ANOVA. This is a multivariate analogue of Levene's test for homogeneity of variances if the distances between group members and group centroids are Euclidean distances.
However, better measures of distance than the Euclidean distance are available for ecological data. These can be accommodated by reducing the distances produced using any dissimilarity coefficient to principal coordinates, which embeds them within a Euclidean space. The analysis then proceeds by calculating the Euclidean distances between group members and the group centroid on the basis of the principal coordinate axes rather than the original distances.
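The embedding step can be illustrated with base R alone. The following is a minimal sketch, not the code used by betadisper: the data, the grouping and all object names are hypothetical, and Euclidean input distances are used so that every eigenvalue is non-negative.

```r
# Minimal sketch (base R only): distances to group centroids in
# principal coordinate space. All data and names are hypothetical.
set.seed(1)
x <- matrix(rnorm(40), nrow = 10)            # 10 samples, 4 variables
grp <- factor(rep(c("a", "b"), each = 5))    # two groups of 5 samples
d <- dist(x)                                 # Euclidean distances

# Principal coordinates analysis; 4 axes suffice for 4 variables
pco <- cmdscale(d, k = ncol(x), eig = TRUE)
pts <- pco$points

# Group centroids on the principal coordinate axes
cen <- apply(pts, 2, function(col) tapply(col, grp, mean))

# Euclidean distance from each sample to its own group centroid
z <- sqrt(rowSums((pts - cen[as.character(grp), , drop = FALSE])^2))

# Average distance to centroid for each group
tapply(z, grp, mean)
```

Because the input distances are Euclidean and all axes are retained, the distances in z should match those computed directly from x; with a non-Euclidean coefficient only the principal coordinate route is available.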
Non-metric dissimilarity coefficients can produce principal coordinate axes that have negative eigenvalues. These correspond to the imaginary, non-metric part of the distance between objects. If negative eigenvalues are produced, we must correct for these imaginary distances.
The distance of a point to its centroid is
z_{ij}^c = \sqrt{\Delta^2(u_{ij}^+, c_i^+) - \Delta^2(u_{ij}^-, c_i^-)},
where \Delta^2 is the squared Euclidean distance between u_{ij}, the
principal coordinate for the jth point in the ith group, and c_i, the
coordinate of the centroid for the ith group. The superscripted ‘+’ and
‘-’ indicate the real and imaginary parts respectively. This is
equation (3) in Anderson (2006). If the imaginary part is greater in
magnitude than the real part, then we would be taking the square root
of a negative value, resulting in NaN, and these cases are changed to
zero distances (with a warning). This is in line with the behaviour of
Marti Anderson's PERMDISP2 programme.
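The equation above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the betadisper source: the count data, the hand-computed Bray-Curtis dissimilarities and every object name are hypothetical, group centroids (not spatial medians) are used, and negative squared distances are set to zero silently instead of with a warning.

```r
# Sketch of equation (3): combine real and imaginary PCoA axes.
# All data and names are hypothetical.
set.seed(42)
comm <- matrix(rpois(48, lambda = 3) + 1, nrow = 8)  # 8 samples, 6 species
grp <- factor(rep(c("a", "b"), each = 4))

# Bray-Curtis dissimilarities computed by hand (a semi-metric, so
# negative eigenvalues may, but need not, appear)
n <- nrow(comm)
D <- matrix(0, n, n)
for (i in seq_len(n)) for (j in seq_len(n))
  D[i, j] <- sum(abs(comm[i, ] - comm[j, ])) / sum(comm[i, ] + comm[j, ])

# Gower double-centring followed by an eigen decomposition
J <- diag(n) - matrix(1 / n, n, n)
e <- eigen(J %*% (-0.5 * D^2) %*% J, symmetric = TRUE)

# Drop axes whose eigenvalues are numerically zero
tol <- sqrt(.Machine$double.eps)
keep <- abs(e$values) > max(tol, tol * e$values[1])
ev <- e$values[keep]

# Scale eigenvectors by sqrt(|eigenvalue|); flag the real axes
pts <- e$vectors[, keep, drop = FALSE] %*% diag(sqrt(abs(ev)), nrow = length(ev))
pos <- ev > 0

# Squared deviations from the group centroid on every axis
cen <- apply(pts, 2, function(col) tapply(col, grp, mean))
res2 <- (pts - cen[as.character(grp), , drop = FALSE])^2

# Real part minus imaginary part, then the square root
z2 <- rowSums(res2[, pos, drop = FALSE]) - rowSums(res2[, !pos, drop = FALSE])
z <- sqrt(pmax(z2, 0))   # negative values become zero distances
```

For centroids, the sum of z2 within a group should equal the sum of squared within-group dissimilarities divided by the group size, which gives a quick check that the real and imaginary parts were combined with the right signs.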
To test if one or more groups is more variable than the others, ANOVA
of the distances to group centroids can be performed and parametric
theory used to interpret the significance of F. An alternative is to
use a permutation test. permutest.betadisper permutes model
residuals to generate a permutation distribution of F under the null
hypothesis of no difference in dispersion between groups.
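The idea of permuting model residuals can be sketched in base R. The code below is a simplified illustration, not the permutest.betadisper implementation: the distances are simulated, and the helper f_stat and all other names are hypothetical.

```r
# Sketch of a residual-permutation test for F (base R only).
# Distances and names are hypothetical.
set.seed(123)
z <- c(rexp(12, rate = 2), rexp(12, rate = 1))  # distances to centroid
grp <- factor(rep(c("a", "b"), each = 12))

# F statistic from a one-way ANOVA of y on g
f_stat <- function(y, g) anova(lm(y ~ g))[1, "F value"]

fit <- lm(z ~ grp)
f_obs <- f_stat(z, grp)
res <- residuals(fit)

# Shuffle the residuals among samples and recompute F each time
nperm <- 199
f_perm <- replicate(nperm, f_stat(sample(res), grp))

# Permutation p-value; the observed F counts as one permutation
p_perm <- (sum(f_perm >= f_obs) + 1) / (nperm + 1)
p_perm
```

With nperm permutations the smallest attainable p-value is 1 / (nperm + 1), so the number of permutations limits the resolution of the test.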
Pairwise comparisons of group mean dispersions can also be performed
using permutest.betadisper. An alternative to the classical
comparison of group dispersions is to calculate Tukey's Honest
Significant Differences between groups, via TukeyHSD.betadisper.
This is a simple wrapper to TukeyHSD. The user is directed to read
the help file for TukeyHSD before using this function. In particular,
note the statement about using the function with unbalanced designs.
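What such a wrapper amounts to can be sketched with base R: fit a one-way aov of the distances on the grouping factor and pass it to TukeyHSD. The distances below are simulated and the object names are hypothetical; the design is balanced, so the caveat about unbalanced designs does not arise here.

```r
# Sketch: Tukey's HSD on distances to centroid (base R only).
# Distances and names are hypothetical.
set.seed(7)
df <- data.frame(
  distances = c(rexp(10, 3), rexp(10, 2), rexp(10, 1)),
  group = factor(rep(c("a", "b", "c"), each = 10))
)

# One-way ANOVA of the distances on the grouping factor
mod_aov <- aov(distances ~ group, data = df)

# Family-wise 95% intervals for all pairwise differences in means
hsd <- TukeyHSD(mod_aov, which = "group", conf.level = 0.95)
hsd$group
```

Each row of the result gives the difference between two group means, the lower and upper interval limits, and an adjusted p-value.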
The results of the analysis can be visualised using the plot
and boxplot methods.
One additional use of these functions is in assessing beta diversity
(Anderson et al. 2006). Function betadiver provides some popular
dissimilarity measures for this purpose.
As noted in passing by Anderson (2006) and in a related
context by O'Neill (2000), estimates of dispersion around a
central location (median or centroid) that is calculated from the same data
will be biased downward. This bias matters most when comparing diversity
among treatments with small, unequal numbers of samples. Setting
bias.adjust=TRUE when using betadisper imposes a \sqrt{n/(n-1)}
correction (Stier et al. 2013).
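The size of the correction can be seen in a small worked example. This sketch assumes the factor is applied to each distance using the size n of the sample's own group; the distances and names are hypothetical.

```r
# Sketch of the sqrt(n / (n - 1)) small-sample correction.
# Distances and names are hypothetical.
z <- c(0.20, 0.30, 0.25, 0.40, 0.50, 0.45, 0.35, 0.55)
grp <- factor(c("a", "a", "a", "b", "b", "b", "b", "b"))

# Group sizes, looked up for every sample
n_g <- table(grp)
n_i <- as.vector(n_g[as.character(grp)])

# Inflate each distance by the factor for its own group
z_adj <- z * sqrt(n_i / (n_i - 1))

# The smaller group (n = 3) is inflated more than the larger (n = 5)
tapply(z_adj / z, grp, unique)
```

The factor is sqrt(3/2) for a group of three samples and sqrt(5/4) for a group of five, which shows why the adjustment matters most for small, unequal groups.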
Value
The anova method returns an object of class "anova" inheriting from
class "data.frame".
The scores method returns a list with one or both of the
components "sites" and "centroids".
The plot function invisibly returns an object of class "ordiplot", a
plotting structure which can be used by identify.ordiplot (to identify
the points) or other functions in the ordiplot family.
The boxplot function invisibly returns a list whose components
are documented in boxplot.
eigenvals.betadisper returns a named vector of eigenvalues.
TukeyHSD.betadisper returns a list. See TukeyHSD for further details.
betadisper returns a list of class "betadisper" with the
following components:
eig |
numeric; the eigenvalues of the principal coordinates analysis. |
vectors |
matrix; the eigenvectors of the principal coordinates analysis. |
distances |
numeric; the Euclidean distances in principal coordinate space between the samples and their respective group centroid or median. |
group |
factor; vector describing the group structure |
centroids |
matrix; the locations of the group centroids or medians on the principal coordinates. |
group.distances |
numeric; the mean distance to each group centroid or median. |
call |
the matched function call. |
Warning
Stewart Schultz noticed that the permutation test for
type="centroid" had the wrong type I error and was
anti-conservative. As such, the default for type has been
changed to "median", which uses the spatial median as the group
centroid. Tests suggest that the permutation test for this type of
analysis gives the correct error rates.
Note
If group consists of a single level or group, then the
anova and permutest methods are not appropriate and if
used on such data will stop with an error.
Missing values in either d or group will be removed
prior to performing the analysis.
Author(s)
Gavin L. Simpson; bias correction by Adrian Stier and Ben Bolker.
References
Anderson, M.J. (2006) Distance-based tests for homogeneity of multivariate dispersions. Biometrics 62, 245–253.
Anderson, M.J., Ellingsen, K.E. & McArdle, B.H. (2006) Multivariate dispersion as a measure of beta diversity. Ecology Letters 9, 683–693.
O'Neill, M.E. (2000) A Weighted Least Squares Approach to Levene's Test of Homogeneity of Variance. Australian & New Zealand Journal of Statistics 42, 81–100.
Stier, A.C., Geange, S.W., Hanson, K.M., & Bolker, B.M. (2013) Predator density and timing of arrival affect reef fish community assembly. Ecology 94, 1057–1068.
See Also
permutest.betadisper, anova.lm, scores, boxplot, TukeyHSD.
Further measures of beta diversity can be found in betadiver.
Examples
data(varespec)
## Bray-Curtis distances between samples
dis <- vegdist(varespec)
## First 16 sites grazed, remaining 8 sites ungrazed
groups <- factor(c(rep(1,16), rep(2,8)), labels = c("grazed","ungrazed"))
## Calculate multivariate dispersions
mod <- betadisper(dis, groups)
mod
## Perform test
anova(mod)
## Permutation test for F
permutest(mod, pairwise = TRUE, permutations = 99)
## Tukey's Honest Significant Differences
(mod.HSD <- TukeyHSD(mod))
plot(mod.HSD)
## Plot the groups and distances to centroids on the
## first two PCoA axes
plot(mod)
## with data ellipses instead of hulls
plot(mod, ellipse = TRUE, hull = FALSE) # 1 sd data ellipse
plot(mod, ellipse = TRUE, hull = FALSE, conf = 0.90) # 90% data ellipse
# plot with manual colour specification
my_cols <- c("#1b9e77", "#7570b3")
plot(mod, col = my_cols, pch = c(16,17), cex = 1.1)
## can also specify which axes to plot, ordering respected
plot(mod, axes = c(3,1), seg.col = "forestgreen", seg.lty = "dashed")
## Draw a boxplot of the distances to centroid for each group
boxplot(mod)
## `scores` and `eigenvals` also work
scrs <- scores(mod)
str(scrs)
head(scores(mod, 1:4, display = "sites"))
# group centroids/medians
scores(mod, 1:4, display = "centroids")
# eigenvalues from the underlying principal coordinates analysis
eigenvals(mod)
## try out bias correction; compare with mod3
(mod3B <- betadisper(dis, groups, type = "median", bias.adjust=TRUE))
anova(mod3B)
permutest(mod3B, permutations = 99)
## should always work for a single group
group <- factor(rep("grazed", NROW(varespec)))
(tmp <- betadisper(dis, group, type = "median"))
(tmp <- betadisper(dis, group, type = "centroid"))
## simulate missing values in 'd' and 'group'
## using spatial medians
groups[c(2,20)] <- NA
dis[c(2, 20)] <- NA
mod2 <- betadisper(dis, groups) ## messages
mod2
permutest(mod2, permutations = 99)
anova(mod2)
plot(mod2)
boxplot(mod2)
plot(TukeyHSD(mod2))
## Using group centroids
mod3 <- betadisper(dis, groups, type = "centroid")
mod3
permutest(mod3, permutations = 99)
anova(mod3)
plot(mod3)
boxplot(mod3)
plot(TukeyHSD(mod3))