compositions-package {compositions}    R Documentation

Compositional Data Analysis

Description

"compositions" is a package for the analysis of compositional and multivariate positive data (generally called "amounts"), based on several alternative approaches.

Details

The DESCRIPTION file:

Package: compositions
Version: 2.0-8
Date: 2024-01-25
Title: Compositional Data Analysis
Author: K. Gerald van den Boogaart <boogaart@hzdr.de>, Raimon Tolosana-Delgado, Matevz Bren
Maintainer: K. Gerald van den Boogaart <support@boogaart.de>
Depends: R (>= 3.6)
Imports: methods, utils, grDevices, stats, tensorA, robustbase, bayesm, graphics, MASS
Suggests: rgl (>= 1.0.1), combinat, energy, knitr, rmarkdown
Description: Provides functions for the consistent analysis of compositional data (e.g. portions of substances) and positive numbers (e.g. concentrations) in the way proposed by J. Aitchison and V. Pawlowsky-Glahn.
License: GPL (>= 2)
URL: http://www.stat.boogaart.de/compositions/
VignetteBuilder: knitr
RoxygenNote: 7.1.1

Index of help topics:



                        structure
Aar                     Composition of glacial sediments from the Aar 
                        massif (Switzerland)
acomp                   Aitchison compositions
acompmargin             Marginal compositions in Aitchison Compositions
Activity10              Activity patterns of a statistician for 20 days
Activity31              Activity patterns of a statistician for 20 days
alr                     Additive log ratio transform
AnimalVegetation        Animal and vegetation measurement
+.aplus                 vectorial arithmetic for data sets with aplus
                        class
aplus                   Amounts analysed in log-scale
apt                     Additive planar transform
ArcticLake              Arctic lake sediment samples of different water
                        depth
arrows3D                arrows in 3D, based on package rgl
as.data.frame.acomp     Convert "compositions" classes to data frames
axis3D                  Drawing a 3D coordinate system to a plot, based
                        on package rgl
balance                 Compute balances for a compositional dataset.
barplot.acomp           Bar charts of amounts
Bayesite                Permeabilities of bayesite
binary                  Treating binary and g-adic numbers
biplot3D                Three-dimensional biplots, based on package rgl
Blood23                 Blood samples
Boxite                  Compositions and depth of 25 specimens of
                        boxite
boxplot.acomp           Displaying compositions and amounts with
                        box-plots
cdt                     Centered default transform
ClamEast                Color-size compositions of 20 clam colonies
                        from East Bay
ClamWest                Color-size compositions of 20 clam colonies
                        from West Bay
clo                     Closure of a composition
clr                     Centered log ratio transform
clr2ilr                 Convert between clr and ilr, and between cpt
                        and ipt.
ClusterFinder1          Heuristics to find subpopulations of outliers
CoDaDendrogram          Dendrogram representation of acomp or rcomp
                        objects
coloredBiplot           A biplot providing somewhat easier access to
                        details of the plot.
colorsForOutliers1      Create a color/char palette for groups of
                        outliers
CompLinModCoReg         Compositional Linear Model of Coregionalisation
compOKriging            Compositional Ordinary Kriging
compositions-package    library(compositions)
ConfRadius              Helper to compute confidence ellipsoids
cor.acomp               Correlations of amounts and compositions
Coxite                  Compositions, depths and porosities of 25
                        specimens of coxite
cpt                     Centered planar transform
DiagnosticProb          Diagnostic probabilities
dist                    Distances in various approaches
ellipses                Draw ellipses
endmemberCoordinates    Recast amounts as mixtures of end-members
Firework                Firework mixtures
geometricmean           The geometric mean
getDetectionlimit       Gets the detection limit stored in the data set
Glacial                 Compositions and total pebble counts of 92
                        glacial tills
groupparts              Group amounts of parts
Hongite                 Compositions of 25 specimens of hongite
HouseholdExp            Household Expenditures
Hydrochem               Hydrochemical composition data set of Llobregat
                        river basin water (NE Spain)
idt                     Isometric default transform
iit                     Isometric identity transform
ilr                     Isometric log ratio transform
ilrBase                 The canonical basis in the clr plane used for
                        ilr and ipt transforms.
ilt                     Isometric log transform
ipt                     Isometric planar transform
is.acomp                Check for compositional data type
IsMahalanobisOutlier    Checking for outliers
isoPortionLines         Isoportion- and Isoproportion-lines
juraset                 The jura dataset
kingTetrahedron         Plotting composition into rotatable tetrahedron
Kongite                 Compositions of 25 specimens of kongite
lines.rmult             Draws connected lines from point to point.
logratioVariogram       Empirical variograms for compositions
MahalanobisDist         Compute Mahalanobis distances based on robust
                        estimations
mean.acomp              Mean amounts and mean compositions
meanRow                 The arithmetic mean of rows or columns
Metabolites             Steroid metabolite patterns in adults and
                        children
missingProjector        Returns a projector to the observed space in
                        case of missings.
missingsInCompositions
                        The policy of treatment of missing values in
                        the "compositions" package
missingSummary          Classify and summarize missing values in a
                        dataset
mix.2aplus              Transformations from 'mixtures' to
                        'compositions' classes
mix.Read                Reads a data file in a mixR format
mvar                    Metric summary statistics of real, amount or
                        compositional data
names.acomp             The names of the parts
normalize               Normalize vectors to norm 1
norm.default            Vector space norm
oneOrDataset            Treating single compositions as one-row
                        datasets
OutlierClassifier1      Detect and classify compositional outliers.
outlierplot             Plot various graphics to analyse outliers.
outliersInCompositions
                        Analysing outliers in compositions.
pairwisePlot            Creates a paneled plot like pairs for two
                        different datasets.
parametricPosdefMat     Unique parametrisations for matrices.
perturbe                Perturbation of compositions
plot3D                  plot in 3D based on rgl
plot3D.acomp            3D-plot of compositional data
plot3D.aplus            3D-plot of positive data
plot3D.rmult            plot in 3D based on rgl
plot3D.rplus            plot in 3D based on rgl
plot.acomp              Ternary diagrams
plot.aplus              Displaying amounts in scatterplots
plot.logratioVariogram
                        Empirical variograms for compositions
plot.missingSummary     Plot a Missing Summary
pMaxMahalanobis         Compute distributions of empirical Mahalanobis
                        distances based on simulations
PogoJump                Hong Kong Pogo-Jumps Championship
power.acomp             Power transform in the simplex
powerofpsdmatrix        power transform of a matrix
princomp.acomp          Principal component analysis for Aitchison
                        compositions
princomp.aplus          Principal component analysis for amounts in log
                        geometry
princomp.rcomp          Principal component analysis for real
                        compositions
princomp.rmult          Principal component analysis for real data
princomp.rplus          Principal component analysis for real amounts
print.acomp             Printing compositional data.
qHotellingsTsq          Hotellings T square distribution
qqnorm.acomp            Normal quantile plots for compositions and
                        amounts
R2                      R square
rAitchison              Aitchison Distribution
+.rcomp                 Arithmetic operations for compositions in a
                        real geometry
rcomp                   Compositions as elements of the simplex
                        embedded in the D-dimensional real space
rcompmargin             Marginal compositions in real geometry
rDirichlet              Dirichlet distribution
read.geoeas             Reads a data file in a geoeas format
relativeLoadings        Loadings of relations of two amounts
replot                  Modify parameters of compositional plots.
rlnorm.rplus            The multivariate lognormal distribution

+.rmult                 vectorial arithmetic for datasets in a
                        classical vector scale
rmult                   Simple treatment of real vectors
rnorm.acomp             Normal distributions on special spaces
robustnessInCompositions
                        Handling robustness issues and outliers in
                        compositions.
+.rplus                 vectorial arithmetic for data sets with rplus
                        class
rplus                   Amounts i.e. positive numbers analysed as
                        objects of the real vector space
runif.acomp             The uniform distribution on the simplex
scalar                  Parallel scalar products
scale                   Normalizing datasets by centering and scaling
Sediments               Proportions of sand, silt and clay in sediments
                        specimens
segments.rmult          Draws straight lines from point to point.
SerumProtein            Serum Protein compositions of blood samples
ShiftOperators          Shifts of machine operators
simpleMissingSubplot    Ternary diagrams
SimulatedAmounts        Simulated amount datasets
simulateMissings        Artificial simulation of various kinds of
                        missings
Skulls                  Measurement of skulls
SkyeAFM                 AFM compositions of 23 aphyric Skye lavas
split.acomp             Splitting datasets in groups given by factors
straight                Draws straight lines.
summary.acomp           Summarizing a compositional dataset in terms of
                        ratios
summary.aplus           Summaries of amounts
summary.rcomp           Summary of compositions in real geometry
sumMissingProjector     Compute the global projector to the observed
                        subspace.
Supervisor              Proportions of supervisor's statements assigned
                        to different categories
ternaryAxis             Axis for ternary diagrams
totals                  Total sum of amounts
tryDebugger             Empirical variograms for compositions
ult                     Uncentered log transform
var.acomp               Variances and covariances of amounts and
                        compositions
variation               Variation matrices of amounts and compositions
var.lm                  Residual variance of a model
vcovAcomp               Variance covariance matrix of parameters in
                        compositional regression
vgmFit                  Compositional variogram model fitting
vgram2lrvgram           vgram2lrvgram
vgram.sph               Variogram functions
WhiteCells              White-cell composition of 30 blood samples by
                        two different methods
Yatquat                 Yatquat fruit evaluation
zeroreplace             Zero-replacement routine

For a detailed "getting started" introduction, use help.start() or help.start(browser="myfavouritebrowser"), go to "Packages", then "compositions", then "overview", and launch the file "UsingCompositions.pdf" from there. Please also check the website http://www.stat.boogaart.de/compositions/ for improved material and our new book, expected to appear in spring 2009.
The package is devoted to the analysis of multiple amounts. Amounts typically have non-negative values and often sum to 100% or one. These constraints lead to spurious effects on the covariance structure, as pointed out by Chayes (1960). The problem is treated rigorously in the monograph by Aitchison (1986), who characterizes compositions as vectors having a relative scale and identifies their sample space with the D-part simplex. However, as of 2005, most statistical packages still do not provide any support for this scale.
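The spurious covariance effect can be reproduced in a few lines of base R. The sketch below is only an illustration: the simulated lognormal amounts and the names raw and closed are assumptions made for this example, and the closure is done by hand rather than with the package's own functions.

```r
set.seed(1)   # reproducible illustration
n <- 200

# Three independent positive amounts (hypothetical simulated data)
raw <- cbind(a = rlnorm(n), b = rlnorm(n), c = rlnorm(n))

# Closure by hand: rescale every row so that its parts sum to one
closed <- raw / rowSums(raw)

round(cor(raw), 2)     # the parts were generated independently
round(cor(closed), 2)  # closure alone changes the correlation structure

# Because every row of 'closed' sums to one, each row of its covariance
# matrix sums to zero (up to rounding error). Each part therefore has to
# covary negatively with at least one other part, whatever the raw data.
row_sums_cov <- rowSums(cov(closed))
row_sums_cov
```

The negative correlations in the closed data are produced by the constant-sum constraint itself, not by any dependence between the original amounts.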
The basic idea of the package exploits the class concept: the analyst gives the data a compositional or amount class, and all further analyses are then (or should be) done automatically in a consistent way. For example, x <- acomp(X); plot(x) should plot the data as a composition (in a ternary diagram) directly, without any further interaction from the user.
The package provides four different approaches to analysing amounts. These approaches are associated with four R classes, representing four different geometries of the sample space of amounts. The geometries depend on two questions: whether the total sum of the amounts is relevant information, and which measure of difference between data is meaningful.

rplus : (Real Plus) The total amount matters, and amounts should be compared on an absolute basis, i.e. the difference between 1g and 2g is the same as the difference between 1kg and 1001g: one gram.
aplus : (Aitchison Plus) The total amount matters, but amounts should be compared relatively, i.e. the difference between 1mg and 2mg is the same as that between 1g and 2g: a doubling.
acomp : (Aitchison composition) The total amount is constant (or an artifact of the sampling/measurement procedure), and the meaningful difference is a relative one. This class follows the original proposals of Aitchison.
rcomp : (Real composition) The sum is a constant, and the difference in amount from 0% to 1% and from 10% to 11% is regarded as equal. This class represents the raw/naive treatment of compositions as elements of the real simplex based on an absolute geometry. This treatment is implicitly used in most amalgamation problems. However, the whole approach suffers from the drawbacks and problems discussed in Chayes (1960) and Aitchison (1986).
The aim of the package is to provide all the functionality to do a consistent analysis in all of these approaches and to make the results obtained with different geometries as easy to compare as possible.
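The two measures of difference behind these classes can be made concrete with plain arithmetic. The following base R sketch is an illustration under stated assumptions: the helper names abs_diff and rel_diff are hypothetical and not part of the package, and the relative difference is taken here as a difference of logarithms.

```r
# Absolute difference (the view taken by rplus and rcomp): subtract amounts
abs_diff <- function(a, b) b - a

# Relative difference (the view taken by aplus and acomp): subtract logs,
# which is the same as comparing the ratio b / a
rel_diff <- function(a, b) log(b) - log(a)

# Amounts in grams
abs_diff(1, 2)          # one gram
abs_diff(1000, 1001)    # also one gram: identical on the absolute scale

rel_diff(0.001, 0.002)  # 1 mg to 2 mg: log(2), a doubling
rel_diff(1, 2)          # 1 g to 2 g: the same doubling
rel_diff(1000, 1001)    # 1 kg to 1001 g: almost no relative change

# Proportions: 0% to 1% and 10% to 11% are equal steps in absolute terms
abs_diff(0.00, 0.01)
abs_diff(0.10, 0.11)
```

On the relative scale, a step starting from exactly 0% has no finite size, which is one reason the index lists a dedicated zero-replacement routine (zeroreplace).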

Note

The package compositions has grown a lot in the last year: missings, robust estimations, outlier detection and classification, codadendrogram. This makes everything much more complex, especially with regard to program testing. We therefore urge our users to report all errors and problems of the latest version (please check first) to support@boogaart.de.

Author(s)

K. Gerald van den Boogaart <boogaart@hzdr.de>, Raimon Tolosana-Delgado, Matevz Bren

Maintainer: K. Gerald van den Boogaart <support@boogaart.de>

References

Aitchison, J. (1986) The Statistical Analysis of Compositional Data. Monographs on Statistics and Applied Probability. Chapman & Hall Ltd., London (UK). 416p.

Aitchison, J., C. Barceló-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A concise guide to the algebraic geometric structure of the simplex, the sample space for compositional data analysis, Terra Nostra, Schriften der Alfred Wegener-Stiftung, 03/2003

Billheimer, D., P. Guttorp, and W.F. Fagan (2001) Statistical interpretation of species composition, Journal of the American Statistical Association, 96 (456), 1205-1214

Chayes, F. (1960). On correlation between variables of constant sum. Journal of Geophysical Research 65(12), 4185–4193.

Pawlowsky-Glahn, V. and J.J. Egozcue (2001) Geometric approach to statistical analysis on the simplex. SERRA 15(5), 384-398

Pawlowsky-Glahn, V. (2003) Statistical modelling on coordinates. In: Thió-Henestrosa, S. and Martín-Fernández, J.A. (Eds.) Proceedings of the 1st International Workshop on Compositional Data Analysis, Universitat de Girona, ISBN 84-8458-111-X, https://ima.udg.edu/Activitats/CoDaWork03/

Mateu-Figueras, G. and Barceló-Vidal, C. (Eds.) Proceedings of the 2nd International Workshop on Compositional Data Analysis, Universitat de Girona, ISBN 84-8458-222-1, https://ima.udg.edu/Activitats/CoDaWork05/

van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to analyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi: 10.1016/j.cageo.2006.11.017.

See Also

compositions-package, missingsInCompositions, robustnessInCompositions, outliersInCompositions

Examples

library(compositions)      # load library
data(SimulatedAmounts)     # load data sa.lognormals
x <- acomp(sa.lognormals)  # Declare the dataset to be compositional
                           # and use relative geometry
plot(x)                    # plot.acomp : ternary diagram
ellipses(mean(x),var(x),r=2,col="red")  # Simplex 2sigma predictive region
pr <- princomp(x)
straight(mean(x),pr$Loadings) 

x <- rcomp(sa.lognormals)  # Declare the dataset to be compositional
                           # and use absolute geometry
plot(x)                    # plot.rcomp : ternary diagram
ellipses(mean(x),var(x),r=2,col="red")  # Real 2sigma predictive region
pr <- princomp(x)          
straight(mean(x),pr$Loadings) 

[Package compositions version 2.0-8 Index]