ordr {ordr}    R Documentation

ordr package

Description

This is a tidyverse extension for handling, manipulating, and visualizing ordination models with consistent conventions and in a tidy workflow.

Details

This package is designed to integrate ordination analysis and biplot visualization into a tidyverse workflow. It is inspired in particular by the extensions ggbiplot and tidygraph.

The package consists of several modules:

Ordinations and biplots

Ordination encompasses a variety of techniques for data compression, dimension reduction, feature extraction, and visualization. Well-known ordination techniques are predominantly unsupervised and include principal components analysis, multidimensional scaling, and correspondence analysis (Podani, 2000, Chapter 7; Palmer, n.d.). These methods are theoretically grounded in geometric data analysis (Le Roux & Rouanet, 2004) and powered by the matrix factorizations described below. A variety of other techniques may also be viewed as ordination, or treated using the same tools, including linear regression, linear discriminant analysis, k-means clustering, and non-negative matrix factorization.

Biplots are two-layered scatterplots widely used to visualize unsupervised SVD-based ordinations. Gabriel (1971) introduced biplots to represent the scores and loadings of PCA on a single set of axes. They have also been used to visualize generalized linear regression and linear discriminant analysis (Greenacre, 2010) and can be adapted to any 2-factor matrix decomposition.
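As a minimal sketch of the two layers of a PCA biplot, the following uses only base R (stats::prcomp() and its biplot() method) rather than any ordr function. The USArrests data set and the object names scores and loadings are illustrative assumptions, not part of the package.

```r
# Illustrative data: a built-in data set with four numeric variables.
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# The two layers of a PCA biplot:
scores   <- pca$x         # one row per observation (state)
loadings <- pca$rotation  # one row per variable

# Draw both layers on a single set of axes, using the default scaling
# of stats::biplot.prcomp(); only drawn in an interactive session.
if (interactive()) {
  biplot(pca)
}
```

Here the scores form the point layer and the loadings form the vector layer; both are expressed in the same principal coordinates.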

Singular value decomposition

The most popular ordination techniques use singular value decomposition (SVD) to factor a data matrix X into a product X = UDV' of two orthogonal (rotation) matrices U and V and a diagonal (scaling) matrix D, with V' the transpose of V. In most cases, the data matrix X is transformed from an original data matrix, e.g. by centering, scaling, double-centering, or log-transforming. The SVD introduces a set of shared orthogonal coordinates in which U encodes the rows of X and V encodes the columns of X. The squared singular values in D are proportional to the variances of X along each of these coordinates (when X is centered), and they proceed in decreasing order, so that the first r (for "rank") columns of U and of V produce a geometrically optimized approximation to X.
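The factorization can be checked numerically with base::svd(). This is a sketch under stated assumptions: the iris measurements and column-centering are illustrative choices, and none of the object names below come from ordr.

```r
# Illustrative data matrix X: the four numeric iris columns, column-centered.
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)

# Factor X = U D V'.
s <- svd(X)
U <- s$u
D <- diag(s$d)
V <- s$v

# The three factors reproduce X up to floating-point error.
reconstruction_error <- max(abs(X - U %*% D %*% t(V)))

# The columns of U and of V are orthonormal.
orthonormality_error <- max(abs(crossprod(U) - diag(4)),
                            abs(crossprod(V) - diag(4)))

# For centered X, the squared singular values divided by (n - 1) equal
# the variances of X along the shared coordinates.
coord_variances <- unname(apply(X %*% V, 2, var))
sv_variances    <- s$d^2 / (nrow(X) - 1)

# Rank-r approximation from the first r columns of U and V.
r   <- 2
X_r <- U[, 1:r] %*% D[1:r, 1:r] %*% t(V[, 1:r])

# Its Frobenius error is determined by the discarded singular values.
approx_error <- sqrt(sum((X - X_r)^2))
```

The singular values in s$d are returned in decreasing order, which is why truncating to the first r columns discards the least informative coordinates.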

Biplots of SVD-based ordinations usually plot the rows and columns of X on these r coordinate axes. For an SVD-based biplot to be truly geometric, the total variance contained in D must be conferred onto U or V, or distributed over both (Orlov, 2015). When D is conferred onto U, the rows of X are represented by the rows of UD, and their distances in the biplot approximate their distances in the original column space of X. Meanwhile, the columns of X are represented by the rows of V. These are unit vectors in the full space of shared coordinates, so their squared lengths in the biplot indicate the proportion of their variance captured by the biplot axes and their cosines with each other approximate the correlations between the columns. Finally, the projection of a row's coordinates (point) onto a column's coordinates (vector) approximates the corresponding entry of X.
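The case in which D is conferred onto U can be sketched in base R as follows. The data set, the centering step, and the names row_coords and col_coords are illustrative assumptions; this is not how ordr itself stores or plots ordinations.

```r
# Illustrative data matrix X: the four numeric iris columns, column-centered.
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
s <- svd(X)

# Confer D onto U: rows of X become rows of UD, columns of X become rows of V.
row_coords <- s$u %*% diag(s$d)
col_coords <- s$v

# In the full set of coordinates, distances between rows of UD equal the
# distances between rows of X.
dist_error <- max(abs(as.vector(dist(row_coords)) - as.vector(dist(X))))

# The rows of V are unit vectors in the full set of coordinates.
col_lengths <- sqrt(rowSums(col_coords^2))

# Inner products of row points with column vectors recover the entries of X.
inner_error <- max(abs(row_coords %*% t(col_coords) - X))

# Keep the first r = 2 coordinates for a planar biplot; the same inner
# products then approximate X rather than reproduce it exactly.
r      <- 2
row_2d <- row_coords[, 1:r]
col_2d <- col_coords[, 1:r]
X_2d   <- row_2d %*% t(col_2d)

# Base-graphics biplot: points for rows, arrows for columns.
if (interactive()) {
  plot(row_2d, asp = 1, xlab = "Coordinate 1", ylab = "Coordinate 2")
  arrows(0, 0, col_2d[, 1], col_2d[, 2], length = 0.1, col = "red")
  text(col_2d, labels = colnames(X), pos = 3, col = "red")
}
```

Setting asp = 1 keeps the two axes on the same scale, so that distances and projections read off the plot are not distorted.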

References

Podani J (2000) "Ordination". Introduction to the Exploration of Multivariate Biological Data Chapter 7, 215–284. Backhuys Publishers, ISBN 90-5782-067-6. https://web.archive.org/web/20200221000313/http://ramet.elte.hu/~podani/books.html

Palmer M Ordination Methods for Ecologists. Website, accessed 2019-07-12. http://ordination.okstate.edu/

Le Roux B & Rouanet H (2004) Geometric Data Analysis: From Correspondence Analysis to Structured Data Analysis. Springer Dordrecht, ISBN: 978-1-4020-2236-4. doi:10.1007/1-4020-2236-0 https://link.springer.com/book/10.1007/1-4020-2236-0

Gabriel KR (1971) "The biplot graphic display of matrices with application to principal component analysis". Biometrika 58(3), 453–467. doi:10.1093/biomet/58.3.453

Greenacre MJ (2010) Biplots in Practice. Fundacion BBVA, ISBN: 978-84-923846. https://www.fbbva.es/microsite/multivariate-statistics/biplots.html

Orlov K (2015) Answer to "PCA and Correspondence analysis in their relation to Biplot". CrossValidated, accessed 2019-07-12. https://stats.stackexchange.com/a/141755/68743


[Package ordr version 0.1.1 Index]