| DirichletRegData {DirichletReg} | R Documentation |
Prepare Compositional Data
Description
This function prepares a matrix with compositional variables for further processing in the DirichletReg package.
Usage
DR_data(Y, trafo = sqrt(.Machine$double.eps), base = 1,
norm_tol = sqrt(.Machine$double.eps))
## S3 method for class 'DirichletRegData'
print(x, type = c("processed", "original"), ...)
## S3 method for class 'DirichletRegData'
summary(object, ...)
Arguments
Y |
A |
trafo |
Either a logical or numeric value.
Transformation of variables causes the values to shrink away from extreme values of 0 and 1, see “Details”.
|
base |
The “base” component to use in the reparametrized model |
norm_tol |
Due to numerical precision, row sums of |
x |
A |
type |
Displays either the (possibly normalized or transformed) |
object |
A |
... |
Further arguments |
Details
Y
Y is a matrix or data.frame containing compositional variables.
If they do not sum up to 1 for all observations, normalization is forced where each row entry is divided by the row's sum (a warning will be issued that normalization was applied).
In case one row-entry (or more) is NA, the whole row will be returned as NA.
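The normalization rule described above can be illustrated in base R. This is a minimal sketch, not the package's internal code: the helper name normalize_rows, the warning text, and the way the tolerance is applied are assumptions made for illustration only.

```r
# Hypothetical helper (not part of DirichletReg): divide each row by its sum.
normalize_rows <- function(Y, tol = sqrt(.Machine$double.eps)) {
  Y <- as.matrix(Y)
  sums <- rowSums(Y)                         # NA if any entry in the row is NA
  off <- !is.na(sums) & abs(sums - 1) > tol  # rows that do not sum to 1
  if (any(off)) warning("some rows did not sum to 1; rows were normalized")
  Y <- Y / sums                              # element [i, j] is divided by sums[i]
  Y[is.na(sums), ] <- NA                     # incomplete rows become NA as a whole
  Y
}

Y <- rbind(c(2, 1, 1),        # sums to 4, will be rescaled
           c(0.2, 0.3, 0.5),  # already sums to 1
           c(NA, 0.5, 0.5))   # contains NA, whole row becomes NA
res <- suppressWarnings(normalize_rows(Y))
res
```

The first row is rescaled to proportions, the second is left unchanged, and the third is NA throughout.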
Beta-distributed variables can be supplied as a single vector, whose values must lie in the interval [0, 1].
The second variable is then generated as 1 - Y, and a matrix consisting of the columns 1 - Y and Y is returned.
A message is issued stating that a beta-distributed variable was assumed and that this assumption needs to be checked.
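The expansion of a single vector into a two-column composition can be sketched in base R as follows. The column names used here are placeholders chosen for illustration and are not necessarily those assigned by DR_data.

```r
# A single variable with values in [0, 1], treated as beta-distributed.
y <- c(0.1, 0.5, 0.9)
stopifnot(all(y >= 0 & y <= 1))

# Build the two-component composition: first column 1 - y, second column y.
Y2 <- cbind(1 - y, y)
colnames(Y2) <- c("1 - Y", "Y")  # hypothetical names for illustration
Y2
rowSums(Y2)                      # every row sums to 1 by construction
```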
trafo
The transformation (done if trafo = TRUE) is a generalization of that proposed by Smithson and Verkuilen (2006) that transforms each component y of Y by computing y^{*}=\frac{y(n-1)+\frac{1}{2}}{n} where n is the number of observations in Y (this approach is also used in the package betareg, see Cribari-Neto & Zeileis, 2010).
For an arbitrary number of dimensions (or variables) d the transformation is y^{*}=\frac{y(n-1)+\frac{1}{d}}{n}.
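A short base R sketch of the general formula is given below. The function name compress is hypothetical, and n is taken here as the number of rows of the matrix, which is an assumption about how observations are counted. Because the d offsets of 1/d add up to 1 in every row, the transformed rows still sum to 1 while all values move strictly inside (0, 1).

```r
# Hypothetical helper (not part of DirichletReg): y* = (y (n - 1) + 1/d) / n
compress <- function(Y) {
  Y <- as.matrix(Y)
  n <- nrow(Y)  # number of observations (assumed: all rows)
  d <- ncol(Y)  # number of components
  (Y * (n - 1) + 1 / d) / n
}

Y <- rbind(c(0, 0.5, 0.5),
           c(1, 0,   0))
Ystar <- compress(Y)
Ystar
rowSums(Ystar)  # rows still sum to 1
range(Ystar)    # no exact 0 or 1 remains
```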
base
To set the base (i.e., omitted) component of Y for the “alternative” (mean/precision) model, the argument base can be used. By default, this is the first variable in Y (if a vector is supplied, the column 1 - Y becomes the base component).
Note that the definition can be overruled in DirichReg.
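As a brief usage sketch, the base component can be chosen when the data object is created and then read back from the "base" attribute listed under “Value”. The example assumes the DirichletReg package is installed and uses its ArcticLake data set.

```r
library(DirichletReg)

# Default: the first column is the base component.
AL1 <- DR_data(ArcticLake[, 1:3])

# Use the second column as the base component instead.
AL2 <- DR_data(ArcticLake[, 1:3], base = 2)

attr(AL1, "base")
attr(AL2, "base")
```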
x and object
Objects created by DR_data.
type
Specifies, for the print method, whether the original or the processed data are displayed.
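The two display modes can be compared as in the following sketch, which assumes the DirichletReg package is installed and relies only on the print signature given under “Usage”.

```r
library(DirichletReg)

AL <- DR_data(ArcticLake[, 1:3])

print(AL, type = "processed")  # data as used for modeling (the default)
print(AL, type = "original")   # data as originally supplied
```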
Value
The function returns a matrix object of class DirichletRegData with the following attributes:
attr(*, "dimnames") |
a list with two entries, row names (by default |
attr(*, "Y.original") |
the original data |
attr(*, "dims") |
number of dimensions of |
attr(*, "dim.names") |
the number of components in |
attr(*, "obs") |
number of observations of |
attr(*, "valid_obs") |
number of valid observations |
attr(*, "normalized") |
a logical value indicating whether the data were normalized |
attr(*, "transformed") |
a logical value indicating whether the data were transformed |
attr(*, "base") |
number of the variable used as the base in the reparametrized model |
Author(s)
Marco J. Maier
References
Smithson, M. & Verkuilen, J. (2006). A Better Lemon Squeezer? Maximum-Likelihood Regression With Beta-Distributed Dependent Variables. Psychological Methods, 11(1), 54–71.
Cribari-Neto, F. & Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24.
Examples
# create a DirichletRegData object from the Arctic Lake data
head(ArcticLake[, 1:3])
AL <- DR_data(ArcticLake[, 1:3])
summary(AL)
head(AL)