ordr {ordr} | R Documentation |
ordr package
Description
This is a tidyverse extension for handling, manipulating, and visualizing ordination models with consistent conventions and in a tidy workflow.
Details
This package is designed to integrate ordination analysis and biplot visualization into a tidyverse workflow. It is inspired in particular by the extensions ggbiplot and tidygraph.
The package consists of several modules:
the 'tbl_ord' class, a wrapper for various ordination object classes
extracting augmentation for the factors of an ordination
using dplyr-verbs to add annotation to the factors
adjusting the conference of inertia between the factors
methods of the above generics for several widely-used object classes
convenient formatting of ordination objects
ggbiplot(), a ggplot2 extension for rendering biplots
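The modules above can be chained into a single workflow. The following is a minimal sketch, assuming that ordr and ggplot2 are installed and that the functions as_tbl_ord(), augment_ord(), mutate_rows(), confer_inertia(), geom_rows_point(), and geom_cols_vector() are available as in the released package; the iris data and the variable name species are illustrative choices only.

```r
# Minimal sketch of the workflow; assumes ordr and ggplot2 are installed.
library(ggplot2)
library(ordr)

# Fit a PCA and wrap the result in the 'tbl_ord' class.
iris_ord <- as_tbl_ord(prcomp(iris[, 1:4], scale. = TRUE))

# Extract model-specific augmentation for the row and column factors.
iris_ord <- augment_ord(iris_ord)

# Annotate the row factor with a dplyr-style verb
# (the name 'species' is an illustrative choice).
iris_ord <- mutate_rows(iris_ord, species = iris$Species)

# Confer all of the inertia onto the rows (cases).
iris_ord <- confer_inertia(iris_ord, 1)

# Build a biplot: points for rows, vectors for columns.
iris_biplot <- ggbiplot(iris_ord) +
  geom_rows_point(aes(color = species)) +
  geom_cols_vector(color = "grey30")
```

Printing iris_biplot renders the plot. Intermediate assignments are used instead of a pipe only to keep each step visible.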
Ordinations and biplots
Ordination encompasses a variety of techniques for data compression, dimension reduction, feature extraction, and visualization. Well-known ordination techniques are predominantly unsupervised and include principal components analysis, multidimensional scaling, and correspondence analysis (Podani, 2000, Chapter 7; Palmer, n.d.). These methods are theoretically grounded in geometric data analysis (Le Roux & Rouanet, 2004) and powered by the matrix factorizations described below. A variety of other techniques may also be viewed as ordination, or treated using the same tools, including linear regression, linear discriminant analysis, k-means clustering, and non-negative matrix factorization.
Biplots are two-layered scatterplots widely used to visualize unsupervised SVD-based ordinations. Gabriel (1971) introduced biplots to represent the scores and loadings of PCA on a single set of axes. They have also been used to visualize generalized linear regression and linear discriminant analysis (Greenacre, 2010) and can be adapted to any 2-factor matrix decomposition.
Singular value decomposition
The most popular ordination techniques use singular value decomposition (SVD) to factor a data matrix X into a product X = UDV' of two orthogonal (rotation) matrices U and V and a diagonal (scaling) matrix D, with V' the transpose of V. In most cases, the data matrix X is transformed from an original data matrix, e.g. by centering, scaling, double-centering, or log-transforming. The SVD introduces a set of shared orthogonal coordinates in which U encodes the rows of X and V encodes the columns of X. The squared singular values in D are proportional to the variances of X along each of these coordinates (when X is centered), and the singular values proceed in decreasing order, so that the first r (for "rank") columns of U and of V, scaled by the first r singular values, produce the best rank-r least-squares approximation to X.
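The factorization can be checked numerically with base R. The sketch below is illustrative only: the choice of the iris measurements, column-centering as the transformation, and r = 2 are assumptions made for the example. It verifies that X = UDV', that the squared singular values divided by n - 1 match the variances reported by prcomp(), and that the squared error of the rank-r approximation equals the sum of the discarded squared singular values.

```r
# Illustrative data: column-centered iris measurements (an assumption).
X0 <- unname(as.matrix(iris[, 1:4]))
X <- sweep(X0, 2, colMeans(X0))

# Singular value decomposition X = U D V'.
s <- svd(X)
U <- s$u
D <- diag(s$d)
V <- s$v

# The product of the three factors recovers X up to rounding error.
recon_error <- max(abs(X - U %*% D %*% t(V)))

# Columns of U and of V are orthonormal.
ortho_error <- max(abs(crossprod(U) - diag(ncol(U))),
                   abs(crossprod(V) - diag(ncol(V))))

# For centered X, squared singular values / (n - 1) are the variances
# along the shared coordinates, as reported by prcomp().
var_svd <- s$d^2 / (nrow(X) - 1)
var_pca <- prcomp(X)$sdev^2

# Rank-r approximation from the first r columns and singular values.
r <- 2
X_r <- U[, 1:r] %*% D[1:r, 1:r] %*% t(V[, 1:r])

# Its squared error equals the sum of the discarded squared singular values.
approx_error <- sum((X - X_r)^2)
discarded <- sum(s$d[(r + 1):length(s$d)]^2)
```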
Biplots of SVD-based ordinations usually plot the rows and columns of X on these r coordinate axes. For an SVD-based biplot to be truly geometric, the total variance contained in D must be conferred onto U or V, or distributed over both (Orlov, 2015). When D is conferred onto U, the rows of X are represented by the rows of UD, and their distances in the biplot approximate their distances in the original column space of X. Meanwhile, the columns of X are represented by the rows of V. These are unit vectors in the full space of shared coordinates, so their squared lengths in the biplot indicate the proportion of their variance captured by the biplot axes and their cosines with each other approximate the correlations between the columns. Finally, the projection of a row's coordinates (point) onto a column's coordinates (vector) approximates the corresponding entry of X.
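The case in which D is conferred onto U can be sketched in base R. As before, the iris measurements, column-centering, and r = 2 are assumptions made for illustration. The sketch checks that distances among the rows of UD in the full coordinate space equal distances among the rows of X, that the rows of V are unit vectors in the full space, and that inner products of the truncated row and column coordinates give the rank-r approximation of X.

```r
# Illustrative data: column-centered iris measurements (an assumption).
X0 <- unname(as.matrix(iris[, 1:4]))
X <- sweep(X0, 2, colMeans(X0))
s <- svd(X)

# Confer all of D onto U: rows are UD, columns are V.
row_coords <- s$u %*% diag(s$d)
col_coords <- s$v

# In the full coordinate space, row distances are preserved exactly.
dist_error <- max(abs(as.vector(dist(X)) - as.vector(dist(row_coords))))

# Rows of V are unit vectors in the full space.
col_lengths_full <- rowSums(col_coords^2)

# Keep the first r coordinates for the biplot.
r <- 2
row_pts <- row_coords[, 1:r]
col_vecs <- col_coords[, 1:r]

# Squared lengths of the column vectors within the biplot plane;
# these cannot exceed their full-space length of 1.
col_lengths_biplot <- rowSums(col_vecs^2)

# Inner products of points and vectors approximate the entries of X;
# they equal the rank-r approximation built from the SVD.
X_biplot <- row_pts %*% t(col_vecs)
X_r <- s$u[, 1:r] %*% diag(s$d[1:r]) %*% t(s$v[, 1:r])
```

Distances among the truncated points row_pts only approximate those in X; the quality of the approximation depends on how much of the total variance the first r coordinates carry.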
References
Podani J (2000) "Ordination". Introduction to the Exploration of Multivariate Biological Data Chapter 7, 215–284. Backhuys Publishers, ISBN 90-5782-067-6. https://web.archive.org/web/20200221000313/http://ramet.elte.hu/~podani/books.html
Palmer M (n.d.) Ordination Methods for Ecologists. Website, accessed 2019-07-12. http://ordination.okstate.edu/
Le Roux B & Rouanet H (2004) Geometric Data Analysis: From Correspondence Analysis to Structured Data Analysis. Springer Dordrecht, ISBN: 978-1-4020-2236-4. doi:10.1007/1-4020-2236-0 https://link.springer.com/book/10.1007/1-4020-2236-0
Gabriel KR (1971) "The biplot graphic display of matrices with application to principal component analysis". Biometrika 58(3), 453–467. doi:10.1093/biomet/58.3.453
Greenacre MJ (2010) Biplots in Practice. Fundacion BBVA, ISBN: 978-84-923846. https://www.fbbva.es/microsite/multivariate-statistics/biplots.html
Orlov K (2015) Answer to "PCA and Correspondence analysis in their relation to Biplot". CrossValidated, accessed 2019-07-12. https://stats.stackexchange.com/a/141755/68743