| tree_var {lares} | R Documentation |
Recursive Partitioning and Regression Trees
Description
Fit and plot an rpart model for exploratory purposes using
the rpart and rpart.plot libraries.
Usage
tree_var(
df,
y,
type = 2,
max = 3,
min = 20,
cp = 0,
ohse = TRUE,
plot = TRUE,
explain = TRUE,
title = NA,
subtitle = NULL,
...
)
Arguments
df |
Data frame |
y |
Variable or Character. Name of the dependent variable or response. |
type |
Type of plot. Possible values:
0 Draw a split label at each split and a node label at each leaf.
1 Label all nodes, not just leaves. Similar to text.rpart's all=TRUE.
2 Default. Like 1 but draw the split labels below the node labels.
3 Draw separate split labels for the left and right directions.
4 Like 3 but label all nodes, not just leaves.
5 Show the split variable name in the interior nodes. |
max |
Integer. Maximal depth of the tree. |
min |
Integer. The minimum number of observations that must exist in a node in order for a split to be attempted. |
cp |
Complexity parameter. Any split that does not decrease the overall
lack of fit by a factor of cp is not attempted. |
ohse |
Boolean. Auto generate One Hot Smart Encoding? |
plot |
Boolean. Return a plot? If not, |
explain |
Boolean. Include a brief explanation on the bottom part of the plot. |
title, subtitle |
Character. Title and subtitle to include in plot.
Set to |
... |
Additional parameters passed to rpart.plot(). |
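
As a rough illustration of what the max, min, and cp arguments control, the sketch below fits a tree with rpart directly. It assumes, without confirmation from this page, that max, min, and cp correspond to the maxdepth, minsplit, and cp settings of rpart.control(). The mtcars dataset and the node-depth calculation are only there for illustration.

```r
library(rpart)

# Assumed mapping (not confirmed by this page):
#   max -> maxdepth, min -> minsplit, cp -> cp
ctrl <- rpart.control(maxdepth = 3, minsplit = 20, cp = 0)

# mtcars ships with base R; mpg is numeric, so this is a regression tree
fit <- rpart(mpg ~ ., data = mtcars, method = "anova", control = ctrl)

# rpart numbers nodes like a binary heap (root = 1, children = 2 and 3, ...),
# so the depth of a node is floor(log2(node number))
node_ids <- as.numeric(rownames(fit$frame))
depth <- floor(log2(node_ids))

# The deepest node should not exceed the requested maximum depth
max(depth)

# The complexity table lists the cp value associated with each split
printcp(fit)
```

With cp = 0 no split is rejected for improving the fit too little, so the depth and minimum node size limits are what keep the tree small.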
Details
rpart differs from the tree function in S mainly in its handling
of surrogate variables. In most details it follows Breiman
et al. (1984) quite closely. The R package tree provides a
re-implementation of tree.
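
The surrogate handling mentioned above can be seen with rpart on its own. The following is a minimal sketch, not part of tree_var itself: it uses the base R airquality dataset, which has missing values in Solar.R, and the maxsurrogate and usesurrogate settings of rpart.control(). The choice of predictors is arbitrary.

```r
library(rpart)

# airquality (base R) has missing values in Ozone and Solar.R.
# Rows with a missing response are dropped by rpart; rows with
# missing predictors are kept and routed using surrogate splits.
fit <- rpart(
  Ozone ~ Solar.R + Wind + Temp,
  data = airquality,
  control = rpart.control(maxsurrogate = 5, usesurrogate = 2)
)

# Rows where Solar.R is missing still receive a prediction
missing_solar <- airquality[is.na(airquality$Solar.R), ]
preds <- predict(fit, newdata = missing_solar)
anyNA(preds)

# summary() lists the primary and surrogate splits chosen at each node
summary(fit)
```

With usesurrogate = 2, an observation whose split variable and surrogates are all missing is sent in the majority direction, so every row reaches a leaf.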
Value
(Invisible) list of type 'tree_var' containing the plot (as a function), model, predictions, performance metrics, and auxiliary interpretation text.
Author(s)
Stephen Milborrow, borrowing heavily from the rpart
package by Terry M. Therneau and Beth Atkinson,
and the R port of that package by Brian Ripley.
References
Breiman L., Friedman J. H., Olshen R. A., and Stone, C. J. (1984) Classification and Regression Trees. Wadsworth.
See Also
Other Exploratory:
corr_cross(),
corr_var(),
crosstab(),
df_str(),
distr(),
freqs_df(),
freqs_list(),
freqs_plot(),
freqs(),
lasso_vars(),
missingness(),
plot_cats(),
plot_df(),
plot_nums()
Other Visualization:
distr(),
freqs_df(),
freqs_list(),
freqs_plot(),
freqs(),
noPlot(),
plot_chord(),
plot_survey(),
plot_timeline()
Examples
data(dft)
# Regression Tree
tree <- tree_var(dft, Fare, subtitle = "Titanic dataset")
tree$plot() # tree plot
tree$model # rpart model object
tree$performance # metrics
# Binary Tree
tree_var(dft, Survived_TRUE, explain = FALSE, cex = 0.8)$plot()
# Multiclass tree
tree_var(dft[, c("Pclass", "Fare", "Age")], Pclass, ohse = FALSE)$plot()