Treee {LDATree}  R Documentation

Classification trees with Linear Discriminant Analysis terminal nodes

Description

[Experimental] Fit an LDATree model.

Usage

Treee(
  formula,
  data,
  missingMethod = c("meanFlag", "newLevel"),
  splitMethod = "LDscores",
  pruneMethod = "none",
  numberOfPruning = 10,
  maxTreeLevel = 4,
  minNodeSize = NULL,
  verbose = FALSE
)

Arguments

formula

an object of class formula, which has the form class ~ x1 + x2 + ...

data

a data frame that contains both predictors and the response. Missing values are allowed in predictors but not in the response.

missingMethod

Methods for handling missing values in numerical variables and factor variables. 'mean', 'median', 'meanFlag', and 'medianFlag' are available for numerical variables. 'mode', 'modeFlag', and 'newLevel' are available for factor variables. A method whose name contains 'Flag' also adds a missing-value flag; the others do not. The 'newLevel' method replaces all missing values with a new level rather than imputing them with an existing value.

splitMethod

the splitting rule used in the LDATree growing process. For now, 'LDscores' is the only available option.

pruneMethod

the model selection method in the LDATree growing process, which controls the size of the tree. By default, it's set to 'none', which applies a direct stopping rule. Alternatively, 'CV' uses the alpha-pruning process from CART. Although 'CV' is often more accurate, it can be slower, especially with large datasets.

numberOfPruning

controls the number of cross-validation folds used in the pruning. It is 10 by default.

maxTreeLevel

controls the largest tree size possible for either a direct-stopping tree or a CV-pruned tree. Adding one extra level (depth) introduces an additional layer of nodes at the bottom of the current tree. For example, when the maximum level is 1 (or 2), the maximum tree size is 3 (or 7) nodes.

minNodeSize

controls the minimum node size. Think carefully before changing this value. Setting a large number might result in early stopping and reduced accuracy. By default, it's set to one plus the number of classes in the response variable.

verbose

a logical. If TRUE, the function provides additional diagnostic messages or detailed output about its progress or internal workings. Default is FALSE, where the function runs silently without additional output.
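
Some of the arguments above can be illustrated with a small base-R sketch. The helpers below (impute_mean_flag, impute_new_level, max_tree_size) and the "Missing" label are hypothetical names introduced for illustration only; they show the general idea behind 'meanFlag' and 'newLevel' and the size arithmetic implied by maxTreeLevel and minNodeSize, and they are not the package's internal implementation. The size formula assumes binary splits, which matches the sizes 3 and 7 quoted for levels 1 and 2.

```r
# Illustrative base-R stand-ins; NOT LDATree's internal code.

# Idea behind "meanFlag" for a numerical column: fill NAs with the
# observed mean and keep a 0/1 indicator of where the NAs were.
impute_mean_flag <- function(x) {
  flag <- as.integer(is.na(x))
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  data.frame(value = x, flag = flag)
}

# Idea behind "newLevel" for a factor column: treat NA as its own
# category. The label "Missing" is an assumption made for this sketch.
impute_new_level <- function(f, label = "Missing") {
  f <- as.character(f)
  f[is.na(f)] <- label
  factor(f)
}

# Largest possible number of nodes for a given maxTreeLevel,
# assuming every split is binary.
max_tree_size <- function(level) 2^(level + 1) - 1

print(impute_mean_flag(c(1, NA, 3)))               # value: 1 2 3, flag: 0 1 0
print(impute_new_level(factor(c("a", NA, "b"))))
print(max_tree_size(c(1, 2, 4)))                   # 3 7 31

# Documented default for minNodeSize: one plus the number of classes.
print(nlevels(iris$Species) + 1)                   # 4 for iris
```

With the default maxTreeLevel = 4, this arithmetic gives at most 31 nodes.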

Details

Unlike other classification trees, LDATree integrates LDA throughout the entire tree-growing process. Here is a breakdown of its distinctive features:

Value

An object of class Treee containing the following components:

Examples

library(LDATree)

# Fit a tree with the default direct stopping rule
fit <- Treee(Species ~ ., data = iris)
# Use cross-validation to prune the tree
fitCV <- Treee(Species ~ ., data = iris, pruneMethod = "CV")
# Predict on the training data
predict(fit, iris)
# Plot the overall tree
plot(fit)
# Plot a certain node
plot(fit, iris, node = 1)
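
As a possible follow-up to the calls above, the sketch below compares predictions with the true labels on the training data. It assumes that predict() on a Treee object returns one class label per row by default, which this page does not state explicitly; the names pred and confusion are illustrative. The code is guarded so that it only runs when the LDATree package is installed.

```r
# Sketch under stated assumptions; checks training-set agreement only.
if (requireNamespace("LDATree", quietly = TRUE)) {
  fit <- LDATree::Treee(Species ~ ., data = iris)

  # Assumed: the default predict() output is a vector of class labels.
  pred <- predict(fit, iris)

  # Cross-tabulate predicted against actual classes.
  confusion <- table(predicted = pred, actual = iris$Species)
  print(confusion)

  # Share of rows where the predicted label matches the true label.
  print(mean(as.character(pred) == as.character(iris$Species)))
}
```

Agreement measured on the training data tends to be optimistic; a held-out subset of the rows would give a fairer estimate.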

[Package LDATree version 0.1.2 Index]