tran {analogue} | R Documentation |
Provides common data transformations and standardizations useful for
palaeoecological data. The function acts as a wrapper to function
decostand
in package vegan for several of the
available options.
The formula
method provides a convenient way of selecting or
excluding subsets of variables before applying the chosen
transformation.
## Default S3 method:
tran(x, method, a = 1, b = 0, p = 2, base = exp(1),
     na.rm = FALSE, na.value = 0, ...)

## S3 method for class 'formula'
tran(formula, data = NULL, subset = NULL,
     na.action = na.pass, ...)
x |
A matrix-like object. |
method |
transformation or standardization method to apply. See Details for available options. |
a |
Constant by which to multiply x (see method log in Details). |
b |
Constant to add to a * x (see method log in Details). |
p |
The power to use in the power transformation. |
base |
the base with respect to which logarithms are
computed. See |
na.rm |
Should missing values be removed before some computations? |
na.value |
The value with which to replace missing values
( |
... |
Arguments passed to |
formula |
A model formula describing the variables to be
transformed. The formula should have only a right hand side,
e.g.~ |
data, subset, na.action |
See |
The function offers the following transformation and standardization methods for community data:
sqrt
: take the square roots of the observed values.
cubert
: take the cube root of the observed values.
rootroot
: take the fourth root of the observed
values. This is also known as the root root transformation (Field
et al 1982).
log
: take the logarithms of the observed values. The
transformation applied can be modified by constants a
and
b
and the base
of the logarithms. The transformation
applied is x* =
log[base](ax + b).
log1p
: computes log(1 + x) accurately also for
|x| << 1 via log1p
. Note the arguments a
and b
have no effect in this method.
expm1
: computes exp(x) - 1 accurately for
|x| << 1 via expm1
.
reciprocal
: returns the multiplicative inverse or
reciprocal, 1/x, of the observed values.
freq
: divide by column (variable, species) maximum and
multiply by the number of non-zero items, so that the average of
non-zero entries is 1 (Oksanen 1983).
center
: centre all variables to zero mean.
range
: standardize values into range 0 ... 1. If all
values are constant, they will be transformed to 0.
percent
: convert observed count values to percentages.
proportion
: convert observed count values to proportions.
standardize
: scale x
to zero mean and unit
variance.
pa
: scale x
to presence/absence scale (0/1).
missing
: replace missing values with na.value
.
chi.square
: divide by row sums and square root of
column sums, and adjust for square root of matrix total
(Legendre & Gallagher 2001). When used with the Euclidean
distance, the distances should be similar to the
Chi-square distance used in correspondence analysis. However, the
results from cmdscale
would still differ, since
CA is a weighted ordination method.
hellinger
: square root of observed values that have
first been divided by row (site) sums (Legendre & Gallagher 2001).
wisconsin
: applies the Wisconsin double
standardization, where columns (species, variables) are first
standardized by maxima and then sites (rows) by site totals.
pcent2prop
: convert percentages to proportions.
prop2pcent
: convert proportions to percentages.
logRatio
: applies a log transformation (see log
above) to the data, then centres the data by rows (by subtracting
the mean of row i from the observations in row
i). Applying PCA to data transformed in this way yields
Aitchison's Log Ratio Analysis (LRA), a means of dealing with closed
compositional data such as are common in palaeoecology (Aitchison, 1983).
power
: applies a power transformation.
rowCentre
, rowCenter
: Centres x
by rows
through the subtraction of the corresponding row mean from the
observations in the row.
colCentre
, colCenter
: Centres x
by columns
through the subtraction of the corresponding column mean from the
observations in the column.
none
: no transformation is applied.
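Several of the methods listed above can be written out by hand in base R, which may help make the definitions concrete. The sketch below is not the package's implementation (tran passes several of these methods on to decostand). The matrix x is made-up data, and the choice of b = 1 in the log-based steps is an assumption made here to avoid taking log(0).

```r
## Hand-rolled base R versions of a few methods, for illustration only.
## `x` is a hypothetical sites-by-species count matrix.
x <- matrix(c(10, 0,  5,
               2, 8,  0,
               4, 4, 12),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("site", 1:3), paste0("sp", 1:3)))

## log: x* = log[base](a * x + b); b = 1 is an assumption to avoid log(0)
a <- 1; b <- 1; base <- 10
x_log <- log(a * x + b, base = base)

## log1p stays accurate where the naive form loses precision
tiny <- 1e-20
naive_log    <- log(1 + tiny)  # 1 + tiny rounds to 1 in double precision
accurate_log <- log1p(tiny)

## hellinger: square root of row (site) proportions
x_hel <- sqrt(x / rowSums(x))

## chi.square: divide by row sums and square root of column sums,
## then multiply by the square root of the matrix total
x_chi <- sqrt(sum(x)) * x / outer(rowSums(x), sqrt(colSums(x)))

## wisconsin: columns by their maxima, then rows by their totals
x_max <- sweep(x, 2, apply(x, 2, max), "/")
x_wis <- x_max / rowSums(x_max)

## logRatio: log transform, then subtract each row mean
x_lr <- log(a * x + b)
x_lr <- x_lr - rowMeans(x_lr)

## rowCentre and colCentre: subtract row or column means
x_rc <- x - rowMeans(x)
x_cc <- sweep(x, 2, colMeans(x), "-")
```

Under these definitions the squared Hellinger values in each row sum to 1, the Wisconsin rows sum to 1, and the rows of the log-ratio matrix have zero mean, which gives a quick sanity check on any of the steps.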
Returns the suitably transformed or standardized x
. If x
is a data frame, the returned value is likewise a data frame. The
returned object also has an attribute "tran"
giving the name of
the applied transformation or standardization "method"
.
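The return value described above can be inspected as in the following minimal sketch. It assumes the analogue package is installed (the code is skipped otherwise), and the data frame d is made up for illustration.

```r
## Assumes the analogue package is installed; `d` is made-up data
if (requireNamespace("analogue", quietly = TRUE)) {
  d <- data.frame(A = c(1, 4, 9), B = c(16, 25, 36))
  res <- analogue::tran(d, method = "sqrt")
  print(is.data.frame(res))  # data frame in, data frame out
  print(attr(res, "tran"))   # name of the method that was applied
}
```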
Gavin L. Simpson. Much of the functionality of tran
is
provided by decostand
, written by Jari Oksanen.
Aitchison, J. (1983) Principal components analysis of compositional data. Biometrika 70(1); 57–65.
Field, J.G., Clarke, K.R., & Warwick, R.M. (1982) A practical strategy for analysing multispecies distribution patterns. Marine Ecology Progress Series 8; 37–52.
Legendre, P. & Gallagher, E.D. (2001) Ecologically meaningful transformations for ordination of species data. Oecologia 129; 271–280.
Oksanen, J. (1983) Ordination of boreal heath-like vegetation with principal component analysis, correspondence analysis and multidimensional scaling. Vegetatio 52; 181–189.
data(swapdiat)

## convert percentages to proportions
sptrans <- tran(swapdiat, "pcent2prop")

## apply Hellinger transformation
spHell <- tran(swapdiat, "hellinger")

## Dummy data to illustrate formula method
d <- data.frame(A = runif(10), B = runif(10), C = runif(10))
## simulate some missings
d[sample(10, 3), 1] <- NA
## apply tran using formula
tran(~ . - B, data = d, na.action = na.pass,
     method = "missing", na.value = 0)