tran {analogue} | R Documentation |
Common data transformations and standardizations
Description
Provides common data transformations and standardizations useful for palaeoecological data. The function acts as a wrapper to function decostand in package vegan for several of the available options.
The formula method offers a convenient way to select or exclude subsets of variables before applying the chosen transformation.
Usage
## Default S3 method:
tran(x, method, a = 1, b = 0, p = 2, base = exp(1),
     na.rm = FALSE, na.value = 0, ...)
## S3 method for class 'formula'
tran(formula, data = NULL, subset = NULL,
     na.action = na.pass, ...)
Arguments
x |
A matrix-like object. |
method |
transformation or standardization method to apply. See Details for available options. |
a |
Constant by which to multiply x in the log transformation (see Details). |
b |
Constant to add to x in the log transformation (see Details). |
p |
The power to use in the power transformation. |
base |
The base with respect to which logarithms are computed. See log. |
na.rm |
Should missing values be removed before some computations? |
na.value |
The value with which to replace missing values (NA). |
... |
Arguments passed to decostand or to other tran methods. |
formula |
A model formula describing the variables to be transformed. The formula should have only a right-hand side, e.g. ~ . - B to select all variables except B. |
data , subset , na.action |
See model.frame. |
Details
The function offers the following transformation and standardization methods for community data:
- sqrt: take the square roots of the observed values.
- cubert: take the cube root of the observed values.
- rootroot: take the fourth root of the observed values. This is also known as the root root transformation (Field et al 1982).
- log: take the logarithms of the observed values. The transformation applied can be modified by constants a and b and the base of the logarithms. The transformation applied is x^* = \log_{\mathrm{base}}(ax + b).
- log1p: computes log(1 + x) accurately also for |x| << 1 via log1p. Note the arguments a and b have no effect in this method.
- expm1: computes exp(x) - 1 accurately for |x| << 1 via expm1.
- reciprocal: returns the multiplicative inverse or reciprocal, 1/x, of the observed values.
- freq: divide by column (variable, species) maximum and multiply by the number of non-zero items, so that the average of non-zero entries is 1 (Oksanen 1983).
- center: centre all variables to zero mean.
- range: standardize values into range 0 ... 1. If all values are constant, they will be transformed to 0.
- percent: convert observed count values to percentages.
- proportion: convert observed count values to proportions.
- standardize: scale x to zero mean and unit variance.
- pa: scale x to presence/absence scale (0/1).
- missing: replace missing values with na.value.
- chi.square: divide by row sums and square root of column sums, and adjust for square root of matrix total (Legendre & Gallagher 2001). When used with the Euclidean distance, the distances should be similar to the Chi-square distance used in correspondence analysis. However, the results from cmdscale would still differ, since CA is a weighted ordination method.
- hellinger: square root of observed values that have first been divided by row (site) sums (Legendre & Gallagher 2001).
- wisconsin: applies the Wisconsin double standardization, where columns (species, variables) are first standardized by maxima and then sites (rows) by site totals.
- pcent2prop: convert percentages to proportions.
- prop2pcent: convert proportions to percentages.
- logRatio: applies a log transformation (see log above) to the data, then centres the data by rows (by subtraction of the mean for row i from the observations in row i). Applying PCA to data transformed in this way results in Aitchison's Log Ratio Analysis (LRA), a means of dealing with closed compositional data such as are common in palaeoecology (Aitchison, 1983).
- power: applies a power transformation.
- rowCentre, rowCenter: centres x by rows through the subtraction of the corresponding row mean from the observations in the row.
- colCentre, colCenter: centres x by columns through the subtraction of the corresponding column mean from the observations in the column.
- none: no transformation is applied.
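As a rough illustration, several of the methods listed above can be written directly in base R. The following is a sketch on a small made-up matrix, not the code used by tran or decostand; the object names, the data values, and the constants a = 1 and b = 1 (chosen so that zero counts stay finite under the log) are assumptions made for the example.

```r
## Made-up abundance matrix: 3 sites (rows) by 4 taxa (columns)
x <- matrix(c(10, 0, 5,  5,
               2, 8, 0, 10,
               4, 4, 4,  8),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("site", 1:3), paste0("sp", 1:4)))

## log: log_base(a * x + b); a, b and base are illustrative choices
a <- 1; b <- 1; base <- exp(1)
x_log <- log(a * x + b, base = base)

## hellinger: square root of values divided by row (site) sums
x_hell <- sqrt(x / rowSums(x))

## chi.square: divide by row sums and square root of column sums,
## then adjust by the square root of the matrix total
x_chi <- sqrt(sum(x)) * x / outer(rowSums(x), sqrt(colSums(x)))

## wisconsin: columns by their maxima, then rows by their totals
x_max <- sweep(x, 2, apply(x, 2, max), "/")
x_wis <- x_max / rowSums(x_max)

## logRatio: log transform, then subtract each row's mean
x_lr <- x_log - rowMeans(x_log)

## rowCentre / colCentre: subtract row or column means
x_rc <- sweep(x, 1, rowMeans(x), "-")
x_cc <- sweep(x, 2, colMeans(x), "-")
```

Two quick sanity checks follow from the definitions: the squared Hellinger values in each row sum to 1, and the row means after the log-ratio step are zero up to rounding error.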
Value
Returns the suitably transformed or standardized x. If x is a data frame, the returned value is likewise a data frame. The returned object also has an attribute "tran" giving the name of the applied transformation or standardization method.
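The returned attribute can be inspected with attr. A minimal sketch, assuming the analogue package is installed (the code is skipped otherwise); it reuses the swapdiat data and the "hellinger" method from the Examples, and the object name spHell is just a label for the result.

```r
## Runs only when the analogue package is available
if (requireNamespace("analogue", quietly = TRUE)) {
  data(swapdiat, package = "analogue")
  spHell <- analogue::tran(swapdiat, method = "hellinger")

  ## Name of the transformation recorded on the result
  print(attr(spHell, "tran"))

  ## A data frame input should give a data frame back
  print(is.data.frame(spHell))
}
```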
Author(s)
Gavin L. Simpson. Much of the functionality of tran is provided by decostand, written by Jari Oksanen.
References
Aitchison, J. (1983) Principal components analysis of compositional data. Biometrika 70(1); 57–65.
Field, J.G., Clarke, K.R., & Warwick, R.M. (1982) A practical strategy for analysing multispecies distribution patterns. Marine Ecology Progress Series 8; 37–52.
Legendre, P. & Gallagher, E.D. (2001) Ecologically meaningful transformations for ordination of species data. Oecologia 129; 271–280.
Oksanen, J. (1983) Ordination of boreal heath-like vegetation with principal component analysis, correspondence analysis and multidimensional scaling. Vegetatio 52; 181–189.
See Also
Examples
data(swapdiat)
## convert percentages to proportions
sptrans <- tran(swapdiat, "pcent2prop")
## apply Hellinger transformation
spHell <- tran(swapdiat, "hellinger")
## Dummy data to illustrate formula method
d <- data.frame(A = runif(10), B = runif(10), C = runif(10))
## simulate some missings
d[sample(10,3), 1] <- NA
## apply tran using formula
tran(~ . - B, data = d, na.action = na.pass,
     method = "missing", na.value = 0)