DirichletRegData {DirichletReg} | R Documentation |
Prepare Compositional Data
Description
This function prepares a matrix with compositional variables for further processing in the DirichletReg package.
Usage
DR_data(Y, trafo = sqrt(.Machine$double.eps), base = 1,
        norm_tol = sqrt(.Machine$double.eps))
## S3 method for class 'DirichletRegData'
print(x, type = c("processed", "original"), ...)
## S3 method for class 'DirichletRegData'
summary(object, ...)
Arguments
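Y | A matrix or data.frame containing compositional variables, or a single numeric vector for a beta-distributed variable, see “Details”. |
trafo | Either a logical or numeric value. Transformation of variables causes the values to shrink away from the extreme values of 0 and 1, see “Details”. |
base | The “base” component to use in the reparametrized model. |
norm_tol | Due to numerical precision, row sums of Y may deviate slightly from 1; norm_tol is the tolerance used when checking them. |
x | A DirichletRegData object. |
type | Displays either the (possibly normalized or transformed) "processed" data or the "original" data. |
object | A DirichletRegData object. |
... | Further arguments. |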
Details
Y
Y is a matrix or data.frame containing compositional variables. If the rows do not all sum to 1, normalization is forced: each row entry is divided by the row's sum, and a warning is issued that normalization was applied. If one or more entries in a row are NA, the whole row is returned as NA.
Beta-distributed variables can be supplied as a single vector, which must have values in the interval [0, 1]. The second variable is generated as 1 - Y, and a matrix consisting of the columns 1 - Y and Y is returned. A message is issued that a beta-distributed variable was assumed and that this assumption needs to be checked.
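The normalization and the expansion of a single vector described above can be sketched in base R. This is only an illustration of the arithmetic, not the package's internal code; the matrix Y and the vector y below are made-up values.

```r
# Hypothetical compositional data: rows do not sum to 1, and one row has an NA
Y <- matrix(c(2, 3, 5,
              1, 1, 2,
              4, NA, 6),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "c")))

# Divide each entry by its row sum; the row sum of a row containing NA is NA,
# so the whole row becomes NA
Y_norm <- Y / rowSums(Y)
Y_norm

# A single beta-distributed vector is expanded to the two columns (1 - y, y)
y <- c(0.2, 0.5, 0.9)
Y_beta <- cbind(1 - y, y)
Y_beta
```

Dividing a matrix by a vector of length nrow(Y) recycles the vector down the columns, so every entry is divided by the sum of its own row.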
trafo
The transformation (done if trafo = TRUE) is a generalization of that proposed by Smithson and Verkuilen (2006), which transforms each component $y$ of $Y$ by computing $y^{*} = [y(n - 1) + 1/2] / n$, where $n$ is the number of observations in $Y$ (this approach is also used in the package betareg, see Cribari-Neto & Zeileis, 2010).
For an arbitrary number of dimensions (or variables) $d$, the transformation is $y^{*} = [y(n - 1) + 1/d] / n$.
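As a rough illustration, the transformation can be written in a few lines of base R, assuming the generalized form y* = [y(n - 1) + 1/d] / n with n observations and d components. The helper name shrink and the example matrix are hypothetical and not part of the package.

```r
# Sketch of y* = (y * (n - 1) + 1/d) / n applied to a whole matrix
# (shrink() is a hypothetical helper, not a DirichletReg function)
shrink <- function(Y) {
  n <- nrow(Y)  # number of observations
  d <- ncol(Y)  # number of dimensions (components)
  (Y * (n - 1) + 1 / d) / n
}

# Made-up compositions that contain exact zeros and ones
Y <- rbind(c(0,    0.5,  0.5),
           c(1,    0,    0),
           c(0.2,  0.3,  0.5),
           c(0.25, 0.25, 0.5))

Y_star <- shrink(Y)
range(Y_star)    # all values now lie strictly between 0 and 1
rowSums(Y_star)  # each row still sums to 1
```

Because every row gains d times 1/d before the division by n, the row sums are preserved while exact zeros and ones are pulled into the interior of the interval.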
base
To set the base (i.e., omitted) component of Y for the “alternative” (mean/precision) model, the argument base can be used. By default it is set to the first variable in Y (if a vector is supplied, the column 1 - Y becomes the base component). Note that this definition can be overruled in DirichReg.
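A short usage sketch of the base argument, using the ArcticLake data from the Examples section. It assumes the DirichletReg package is installed and only uses the arguments and attributes documented on this page.

```r
library(DirichletReg)

# Default: the first column of Y is used as the base component
AL1 <- DR_data(ArcticLake[, 1:3])
attr(AL1, "base")

# Use the second column as the base component instead
AL2 <- DR_data(ArcticLake[, 1:3], base = 2)
attr(AL2, "base")
```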
x and object
Objects created by DR_data.
type
Specifies for the print method whether the original or processed data are displayed.
Value
The function returns a matrix object of class DirichletRegData with the following attributes:
attr(*, "dimnames") | a list with two entries, row names (by default |
attr(*, "Y.original") | the original data |
attr(*, "dims") | number of dimensions of Y |
attr(*, "dim.names") | the number of components in Y |
attr(*, "obs") | number of observations of Y |
attr(*, "valid_obs") | number of valid observations |
attr(*, "normalized") | a logical value indicating whether the data were normalized |
attr(*, "transformed") | a logical value indicating whether the data were transformed |
attr(*, "base") | number of the variable used as the base in the reparametrized model |
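The attributes listed above can be inspected with attr(). The following is a minimal sketch that assumes the DirichletReg package is installed and reuses the ArcticLake data from the Examples section; no particular output is implied.

```r
library(DirichletReg)

AL <- DR_data(ArcticLake[, 1:3])

class(AL)                     # should include "DirichletRegData"
attr(AL, "obs")               # number of observations
attr(AL, "valid_obs")         # number of valid observations
attr(AL, "normalized")        # were the rows rescaled to sum to 1?
attr(AL, "transformed")       # was the shrinking transformation applied?
head(attr(AL, "Y.original"))  # the data as originally supplied

# The print method can show the original instead of the processed values
print(AL, type = "original")
```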
Author(s)
Marco J. Maier
References
Smithson, M. & Verkuilen, J. (2006). A Better Lemon Squeezer? Maximum-Likelihood Regression With Beta-Distributed Dependent Variables. Psychological Methods, 11(1), 54–71.
Cribari-Neto, F. & Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24.
Examples
library(DirichletReg)
# create a DirichletRegData object from the Arctic Lake data
head(ArcticLake[, 1:3])
AL <- DR_data(ArcticLake[, 1:3])
summary(AL)
head(AL)