structree {structree} | R Documentation |
Tree-Structured Clustering
Description
Fusion of categories of ordinal or nominal predictors or fusion of measurement units by tree-structured clustering.
Usage
structree(formula, data, family = gaussian, stop_criterion = c("AIC",
"BIC", "CV", "pvalue"), splits_max = NULL, fold = 5, alpha = 0.05,
grid_value = NULL, min_border = NULL, ridge = FALSE,
constant_covs = FALSE, trace = TRUE, plot = TRUE, k = 10,
weights = NULL, offset = NULL, ...)
## S3 method for class 'structree'
print(x, ...)
## S3 method for class 'structree'
coef(object, ...)
Arguments
formula |
Object of class |
data |
Data.frame of class |
family |
A description of the error distribution and link function to be used in the model.
This can be a character string naming a family function, a family function or the result of a call to a family function.
See |
stop_criterion |
Criterion to determine the optimal number of splits in the tree component of the model;
one out of "AIC", "BIC", "CV" and "pvalue". |
splits_max |
Maximal number of splits in the tree component. |
fold |
Number of folds; only for stop criterion "CV". |
alpha |
Significance level; only for stop criterion "pvalue". |
grid_value |
An optional parameter; |
min_border |
An optional parameter; |
ridge |
If true, a small ridge penalty is added to obtain the order of measurement units; only for repeated measurements. |
constant_covs |
Must be set to true if constant covariates are available; only for repeated measurements (currently only available for Gaussian response). |
trace |
If true, information about the estimation progress is printed. |
plot |
If true, the smooth components of the model are plotted; only for categorical predictors. |
k |
Dimension of the B-spline basis that is used to fit smooth components. For details see |
weights |
An optional vector of prior weights to be used in the fitting process; see also |
offset |
An a priori known component to be included in the linear predictor during fitting; see also |
... |
Further arguments passed to or from other methods. |
x , object |
Object of class |
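The criterion-specific arguments can be illustrated with a minimal sketch. It reuses the rent data and the model formula from the Examples section of this page; the values for fold and alpha are the defaults from the Usage section. Fitting is opt-in through the hypothetical switch run_fit (not part of the package), in the spirit of the "Not run" convention, because the fits can take a while.

```r
# Hypothetical switch (not part of structree): set to TRUE to run the fits.
run_fit <- FALSE

if (run_fit && requireNamespace("structree", quietly = TRUE)) {
  library(structree)
  data(rent)

  # Cross-validation as stopping criterion; `fold` sets the number of folds.
  mod_cv <- structree(nmqm ~ tr(bez) + tr(bj) + tr(rooms) + badkach0,
                      data = rent, family = gaussian,
                      stop_criterion = "CV", fold = 5)

  # p-value based stopping; `alpha` sets the significance level.
  mod_p <- structree(nmqm ~ tr(bez) + tr(bj) + tr(rooms) + badkach0,
                     data = rent, family = gaussian,
                     stop_criterion = "pvalue", alpha = 0.05)

  print(mod_cv)
  print(mod_p)
}
```

Arguments that belong to a different stopping criterion (for example alpha when stop_criterion = "CV") are described above as applying only to their own criterion.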
Details
A typical formula has the form response ~ predictors, where response is the name of the response variable and predictors is a series of terms that specify the predictor of the model.
For an ordinal or nominal predictor z one has to enter tr(z) into the formula.
For smooth components x one has to enter s(x) into the formula; currently not implemented for repeated measurements.
For fixed effects z of observation units u one has to enter tr(z|u) into the formula.
A unit-specific intercept is specified by tr(1|u).
The framework only allows for categorical predictors or observation units in the tree component, but not both.
All other predictors with a linear term are entered as usual by x1+...+xp.
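The formula conventions above can be sketched with base R alone. The variable names (y, z, u, x, x1, x2) are hypothetical placeholders, and nothing is fitted here: the formulas are only constructed and inspected.

```r
# Variable names below are hypothetical placeholders, not package data.

# Categorical predictor z in the tree component; x1 and x2 enter linearly.
f_cat <- y ~ tr(z) + x1 + x2

# Categorical predictor z in the tree component plus a smooth term in x.
f_smooth <- y ~ tr(z) + s(x) + x1

# Unit-specific intercepts for observation units u.
f_int <- y ~ tr(1 | u) + x1

# Unit-specific effects of z for observation units u.
f_slope <- y ~ tr(z | u) + x1

# all.vars() lists the variables each formula expects to find in `data`.
vars_cat   <- all.vars(f_cat)
vars_int   <- all.vars(f_int)
vars_slope <- all.vars(f_slope)
print(vars_cat)
print(vars_int)
print(vars_slope)
```

Since the tree component holds either categorical predictors or observation units, a formula would use terms like those in f_cat or terms like those in f_int and f_slope, but not a mixture of the two.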
Value
Object of class "structree".
An object of class "structree" is a list containing the following components:
coefs_end |
all coefficients of the estimated model |
partitions |
list of matrices containing the partitions of the predictors in the tree component including all iterations |
beta_hat |
list of matrices with the fitted coefficients in the tree component including all iterations |
which_opt |
number of the optimal model (total number of splits-1) |
opts |
number of splits per predictor in the tree component |
order |
list of ordered split-points of the predictors in the tree component |
tune_values |
values of the stopping criterion that determine the optimal model |
group_ID |
list of the group IDs for each observation |
coefs_group |
list of coefficients of the estimated model |
y |
Response vector |
DM_kov |
Design matrix |
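Because the returned object is a list, its components are accessed by name. The sketch below collects the component names listed above and defines a small helper, missing_components(), that reports which of them are absent from a given object. The helper is hypothetical and not part of the package; it is only meant as a quick sanity check.

```r
# Component names as listed in the Value section of this page.
structree_components <- c("coefs_end", "partitions", "beta_hat", "which_opt",
                          "opts", "order", "tune_values", "group_ID",
                          "coefs_group", "y", "DM_kov")

# Hypothetical helper (not part of structree): return the documented
# components that are absent from a list-like object.
missing_components <- function(obj) {
  setdiff(structree_components, names(obj))
}

# With a fitted model `mod`, components would be accessed by name, e.g.:
#   mod$which_opt            # number of the optimal model
#   mod$opts                 # number of splits per predictor
#   mod$tune_values          # values of the stopping criterion
#   missing_components(mod)  # empty if the object matches this page

# Self-check on an empty list, for which every component is missing.
print(missing_components(list()))
```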
Author(s)
Moritz Berger <Moritz.Berger@imbie.uni-bonn.de>
http://www.imbie.uni-bonn.de/personen/dr-moritz-berger/
References
Tutz, Gerhard and Berger, Moritz (2018): Tree-structured modelling of categorical predictors in regression, Advances in Data Analysis and Classification 12(3), 737-758.
Berger, Moritz and Tutz, Gerhard (2018): Tree-structured clustering in fixed effects models, Journal of Computational and Graphical Statistics 27(2), 380-392.
See Also
Examples
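data(rent)
## Not run:
# Fuse categories of bez, bj and rooms in the tree component;
# badkach0 enters linearly. The number of splits is chosen by cross-validation.
mod <- structree(nmqm ~ tr(bez) + tr(bj) + tr(rooms) + badkach0,
                 data = rent, family = gaussian, stop_criterion = "CV")
print(mod)
coef(mod)
## End(Not run)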