DirichletRegData {DirichletReg} | R Documentation |
Prepare Compositional Data
Description
This function prepares a matrix with compositional variables for further processing in the DirichletReg package.
Usage
DR_data(Y, trafo = sqrt(.Machine$double.eps), base = 1,
norm_tol = sqrt(.Machine$double.eps))
## S3 method for class 'DirichletRegData'
print(x, type = c("processed", "original"), ...)
## S3 method for class 'DirichletRegData'
summary(object, ...)
Arguments
Y |
A |
trafo |
Either a logical or numeric value.
Transformation of variables causes the values to shrink away from extreme values of 0 and 1, see “Details”.
|
base |
The “base” component to use in the reparametrized model |
norm_tol |
Due to numerical precision, row sums of |
x |
A |
type |
Displays either the (possibly normalized or transformed) |
object |
A |
... |
Further arguments |
Details
Y
Y is a matrix or data.frame containing compositional variables.
If they do not sum up to 1 for all observations, normalization is forced: each row entry is divided by the row's sum, and a warning is issued that normalization was applied.
If one or more entries of a row are NA, the whole row is returned as NA.
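The normalization rule above can be illustrated with a minimal base-R sketch. It is not the package's implementation: normalize_rows is a hypothetical helper, its tol argument only assumes a role similar to norm_tol, and the warning text is invented for illustration.

```r
# Hypothetical helper, NOT part of DirichletReg: a sketch of row-wise
# normalization with NA propagation, as described in the text above.
normalize_rows <- function(Y, tol = sqrt(.Machine$double.eps)) {
  Y <- as.matrix(Y)
  sums <- rowSums(Y)                  # NA if any entry of the row is NA
  Y[is.na(sums), ] <- NA              # an incomplete row becomes all NA
  off <- !is.na(sums) & abs(sums - 1) > tol
  if (any(off)) {
    warning("some rows did not sum up to 1; normalization applied")
    # divide each affected row by its own row sum
    Y[off, ] <- Y[off, , drop = FALSE] / sums[off]
  }
  Y
}

Y <- rbind(c(0.2, 0.3, 0.5),   # already sums to 1
           c(2,   1,   1),     # sums to 4, will be rescaled
           c(0.1, NA,  0.9))   # contains NA, whole row becomes NA
Z <- suppressWarnings(normalize_rows(Y))
```

In this sketch the second row is rescaled by its sum of 4 and the third row is returned as NA throughout, while the first row is left unchanged.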
Beta-distributed variables can be supplied as a single vector, which must have values in the interval [0, 1].
The second variable is generated as 1 - Y, and a matrix consisting of the columns 1 - Y and Y is returned.
A message is issued that a beta-distributed variable was assumed and that this assumption needs to be checked.
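As a rough illustration of the vector case, the following base-R sketch builds the two-column matrix by hand. The helper name beta_to_matrix, the column labels, and the message wording are assumptions made for this example, not the package's own.

```r
# Hypothetical helper, NOT part of DirichletReg: expands a single vector of
# proportions into the two components (1 - Y, Y) described above.
beta_to_matrix <- function(y) {
  stopifnot(is.numeric(y), all(y >= 0 & y <= 1, na.rm = TRUE))
  message("single vector supplied; a beta-distributed variable is assumed, please check")
  cbind("1 - Y" = 1 - y, "Y" = y)    # column order: 1 - Y first, then Y
}

y <- c(0.1, 0.4, 0.75)
B <- suppressMessages(beta_to_matrix(y))
```

Each row of B sums to 1 by construction, so the result can be treated like any other two-component composition.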
trafo
The transformation (done if trafo = TRUE) is a generalization of that proposed by Smithson and Verkuilen (2006), which transforms each component y of Y by computing y^{*} = \frac{y(n-1) + \frac{1}{2}}{n}, where n is the number of observations in Y (this approach is also used in the package betareg, see Cribari-Neto & Zeileis, 2010).
For an arbitrary number of dimensions (or variables) d, the transformation is y^{*} = \frac{y(n-1) + \frac{1}{d}}{n}.
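The generalized formula can be checked numerically with a short base-R sketch. The helper shrink is hypothetical and assumes a complete matrix (no NA rows) whose rows already sum to 1, taking n as the number of rows and d as the number of columns.

```r
# Hypothetical helper, NOT part of DirichletReg: applies
# y* = (y * (n - 1) + 1/d) / n to every entry of a composition matrix.
shrink <- function(Y) {
  Y <- as.matrix(Y)
  n <- nrow(Y)   # number of observations
  d <- ncol(Y)   # number of components
  (Y * (n - 1) + 1 / d) / n
}

Y <- rbind(c(0,    0.5,  0.5),
           c(1,    0,    0),
           c(0.2,  0.3,  0.5),
           c(0.25, 0.25, 0.5))
Ys <- shrink(Y)
```

Because the d offsets of 1/d add up to 1, each transformed row still sums to ((n - 1) + 1)/n = 1, while exact zeros move to 1/(dn) and exact ones to (n - 1 + 1/d)/n, so all values lie strictly inside (0, 1).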
base
The argument base sets the base (i.e., omitted) component of Y for the “alternative” (mean/precision) model.
By default, this is the first variable in Y (if a vector is supplied, the column 1 - Y becomes the base component).
Note that this setting can be overruled in DirichReg.
x and object
Objects created by DR_data.
type
Specifies for the print method whether the original or processed data are displayed.
Value
The function returns a matrix object of class DirichletRegData with the following attributes:
attr(* , "dimnames") |
a list with two entries, row names (by default |
attr(* , "Y.original") |
the original data |
attr(* , "dims") |
number of dimensions of |
attr(* , "dim.names") |
the number of components in |
attr(* , "obs") |
number of observations of |
attr(* , "valid_obs") |
number of valid observations |
attr(* , "normalized") |
a logical value indicating whether the data were normalized |
attr(* , "transformed") |
a logical value indicating whether the data were transformed |
attr(* , "base") |
number of the variable used as the base in the reparametrized model |
Author(s)
Marco J. Maier
References
Smithson, M. & Verkuilen, J. (2006). A Better Lemon Squeezer? Maximum-Likelihood Regression With Beta-Distributed Dependent Variables. Psychological Methods, 11(1), 54–71.
Cribari-Neto, F. & Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24.
Examples
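library(DirichletReg)

# create a DirichletRegData object from the Arctic Lake data
head(ArcticLake[, 1:3])            # inspect the raw compositional columns
AL <- DR_data(ArcticLake[, 1:3])   # prepare the data for DirichletReg
summary(AL)                        # summary method for DirichletRegData
head(AL)                           # first rows of the processed data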