| tran {analogue} | R Documentation |
Common data transformations and standardizations
Description
Provides common data transformations and standardizations useful for
palaeoecological data. The function acts as a wrapper to function
decostand in package vegan for several of the
available options.
The formula method provides a convenient way to select or
exclude subsets of variables before applying the chosen
transformation.
Usage
## Default S3 method:
tran(x, method, a = 1, b = 0, p = 2, base = exp(1),
     na.rm = FALSE, na.value = 0, ...)
## S3 method for class 'formula'
tran(formula, data = NULL, subset = NULL,
     na.action = na.pass, ...)
Arguments
x |
A matrix-like object. |
method |
transformation or standardization method to apply. See Details for available options. |
a |
Constant to multiply |
b |
Constant to add to |
p |
The power to use in the power transformation. |
base |
the base with respect to which logarithms are
computed. See |
na.rm |
Should missing values be removed before some computations? |
na.value |
The value with which to replace missing values
( |
... |
Arguments passed to |
formula |
A model formula describing the variables to be
transformed. The formula should have only a right hand side,
e.g.~ |
data, subset, na.action |
See |
Details
The function offers the following transformation and standardization methods for community data:
- sqrt: take the square roots of the observed values.
- cubert: take the cube root of the observed values.
- rootroot: take the fourth root of the observed values. This is also known as the root root transformation (Field et al. 1982).
- log: take the logarithms of the observed values. The transformation applied can be modified by the constants a and b and the base of the logarithms. The transformation applied is x^* = \log_{\mathrm{base}}(ax + b).
- log1p: computes log(1 + x) accurately also for |x| << 1 via log1p. Note that the arguments a and b have no effect in this method.
- expm1: computes exp(x) - 1 accurately for |x| << 1 via expm1.
- reciprocal: returns the multiplicative inverse or reciprocal, 1/x, of the observed values.
- freq: divide by column (variable, species) maximum and multiply by the number of non-zero items, so that the average of non-zero entries is 1 (Oksanen 1983).
- center: centre all variables to zero mean.
- range: standardize values into the range 0 ... 1. If all values are constant, they will be transformed to 0.
- percent: convert observed count values to percentages.
- proportion: convert observed count values to proportions.
- standardize: scale x to zero mean and unit variance.
- pa: scale x to presence/absence scale (0/1).
- missing: replace missing values with na.value.
- chi.square: divide by row sums and square root of column sums, and adjust for square root of matrix total (Legendre & Gallagher 2001). When used with the Euclidean distance, the distances should be similar to the Chi-square distance used in correspondence analysis. However, the results from cmdscale would still differ, since CA is a weighted ordination method.
- hellinger: square root of observed values that have first been divided by row (site) sums (Legendre & Gallagher 2001).
- wisconsin: applies the Wisconsin double standardization, where columns (species, variables) are first standardized by maxima and then sites (rows) by site totals.
- pcent2prop: convert percentages to proportions.
- prop2pcent: convert proportions to percentages.
- logRatio: applies a log transformation (see log above) to the data, then centres the data by rows (by subtraction of the mean for row i from the observations in row i). Applying PCA to data transformed in this way yields Aitchison's Log Ratio Analysis (LRA), a means of dealing with closed compositional data such as are common in palaeoecology (Aitchison 1983).
- power: applies a power transformation, with the power given by p.
- rowCentre, rowCenter: centres x by rows through the subtraction of the corresponding row mean from the observations in the row.
- colCentre, colCenter: centres x by columns through the subtraction of the corresponding column mean from the observations in the column.
- none: no transformation is applied.
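To make a few of these definitions concrete, the following is a minimal base R sketch of the log, hellinger and logRatio calculations as they are described above. The helper names (log_sketch, hellinger_sketch, log_ratio_sketch) and the toy matrix are hypothetical, introduced only for illustration; this is not the code used by tran or decostand, and it ignores details such as missing values and empty rows.

```r
## Toy count matrix: 3 sites (rows) by 3 taxa (columns); values are made up
x <- matrix(c(10, 0, 30,
               5, 5, 10,
               1, 8,  1),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("site", 1:3), c("sp1", "sp2", "sp3")))

## log: log_base(a * x + b), following the formula given in Details
log_sketch <- function(x, a = 1, b = 0, base = exp(1)) {
    log(a * x + b, base = base)
}

## hellinger: square root of values first divided by their row (site) sums
hellinger_sketch <- function(x) {
    sqrt(x / rowSums(x))
}

## logRatio: log transform, then subtract each row's mean from that row
log_ratio_sketch <- function(x, ...) {
    lx <- log_sketch(x, ...)
    lx - rowMeans(lx)
}

hell <- hellinger_sketch(x)
lr   <- log_ratio_sketch(x, b = 1)  # b = 1 keeps zero counts finite

rowSums(hell^2)  # squared Hellinger values sum to 1 within each site
rowMeans(lr)     # row means are zero up to floating point error
```

Dividing a matrix by rowSums(x) recycles the row totals down the columns, so each entry is divided by the total of its own row; the same recycling is what makes the row centring in log_ratio_sketch work.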
Value
Returns the suitably transformed or standardized x. If x
is a data frame, the returned value is likewise a data frame. The
returned object also has an attribute "tran" giving the name of
the applied transformation or standardization method.
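The shape of the returned object can be illustrated in base R. The function sqrt_sketch below is a hypothetical stand-in for tran(x, "sqrt"), written only to show the two properties described here (a data frame in gives a data frame out, and the method name is stored in the "tran" attribute); it is not the package's implementation.

```r
## Hypothetical stand-in for tran(x, "sqrt"); base R only
sqrt_sketch <- function(x) {
    was_df <- is.data.frame(x)
    res <- sqrt(as.matrix(x))
    ## return a data frame only if one was supplied
    if (was_df) res <- as.data.frame(res)
    ## record the name of the applied method, as described in Value
    attr(res, "tran") <- "sqrt"
    res
}

d <- data.frame(A = c(0, 4, 9), B = c(1, 16, 25))
out <- sqrt_sketch(d)

is.data.frame(out)  # the input class is preserved
attr(out, "tran")   # name of the applied method
```

With an object returned by tran itself, the same attr(obj, "tran") call should retrieve the method name.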
Author(s)
Gavin L. Simpson. Much of the functionality of tran is
provided by decostand, written by Jari Oksanen.
References
Aitchison, J. (1983) Principal components analysis of compositional data. Biometrika 70(1); 57–65.
Field, J.G., Clarke, K.R., & Warwick, R.M. (1982) A practical strategy for analysing multispecies distribution patterns. Marine Ecology Progress Series 8; 37–52.
Legendre, P. & Gallagher, E.D. (2001) Ecologically meaningful transformations for ordination of species data. Oecologia 129; 271–280.
Oksanen, J. (1983) Ordination of boreal heath-like vegetation with principal component analysis, correspondence analysis and multidimensional scaling. Vegetatio 52; 181–189.
See Also
Examples
data(swapdiat)
## convert percentages to proportions
sptrans <- tran(swapdiat, "pcent2prop")
## apply Hellinger transformation
spHell <- tran(swapdiat, "hellinger")
## Dummy data to illustrate formula method
d <- data.frame(A = runif(10), B = runif(10), C = runif(10))
## simulate some missings
d[sample(10,3), 1] <- NA
## apply tran using formula
tran(~ . - B, data = d, na.action = na.pass,
     method = "missing", na.value = 0)
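For readers who want to see what the one-sided formula selects, here is a base R sketch of the same idea that does not call tran. It assumes only that the formula is expanded against the data so that the surviving term labels name the variables to keep; how the formula method does this internally is not shown here, and the object names tt, keep and sel are illustrative.

```r
set.seed(1)  # make the dummy data reproducible
d <- data.frame(A = runif(10), B = runif(10), C = runif(10))
d[sample(10, 3), 1] <- NA

## Expand '.' against the columns of d and drop B;
## the remaining term labels name the selected variables
tt <- terms(~ . - B, data = d)
keep <- attr(tt, "term.labels")

## Subset to the selected variables, then replace missing values with 0,
## mirroring method = "missing" with na.value = 0
sel <- d[, keep, drop = FALSE]
sel[is.na(sel)] <- 0

names(sel)   # variables retained by the formula
anyNA(sel)   # no missing values remain
```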