tree_var {lares} | R Documentation |
Recursive Partitioning and Regression Trees
Description
Fit and plot an rpart model for exploratory purposes using the rpart and rpart.plot libraries.
Usage
tree_var(
df,
y,
type = 2,
max = 3,
min = 20,
cp = 0,
ohse = TRUE,
plot = TRUE,
explain = TRUE,
title = NA,
subtitle = NULL,
...
)
Arguments
df |
Data frame |
y |
Variable or Character. Name of the dependent variable or response. |
type |
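Type of plot. Possible values:
0 Draw a split label at each split and a node label at each leaf.
1 Label all nodes, not just leaves. Similar to text.rpart's all=TRUE.
2 Default. Like 1 but draw the split labels below the node labels.
3 Draw separate split labels for the left and right directions.
4 Like 3 but label all nodes, not just leaves. Similar to text.rpart's fancy=TRUE.
5 Show the split variable name in the interior nodes. |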
max |
Integer. Maximal depth of the tree. |
min |
Integer. The minimum number of observations that must exist in a node in order for a split to be attempted. |
cp |
complexity parameter. Any split that does not decrease the overall
lack of fit by a factor of |
ohse |
Boolean. Auto generate One Hot Smart Encoding? |
plot |
Boolean. Return a plot? If not, |
explain |
Boolean. Include a brief explanation on the bottom part of the plot. |
title , subtitle |
Character. Title and subtitle to include in plot.
Set to |
... |
Additional parameters passed to |
Details
This differs from the tree function in S mainly in its handling of surrogate variables. In most details it follows Breiman et al. (1984) quite closely. R package tree provides a re-implementation of tree.
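The max, min and cp arguments read like the depth, minimum-split and complexity settings of rpart itself. The following is a minimal sketch of a comparable direct fit with rpart and rpart.plot. It assumes, without confirmation from this page, that max, min and cp correspond to maxdepth, minsplit and cp in rpart.control(). The kyphosis data set and the variable names ctrl, fit and node_depth are chosen for illustration only.

```r
# Sketch only: assumes tree_var's max/min/cp correspond to rpart.control's
# maxdepth/minsplit/cp; the lares internals are not shown on this page.
library(rpart)
library(rpart.plot)

# kyphosis ships with rpart, so no external data is needed
data(kyphosis, package = "rpart")

ctrl <- rpart.control(
  maxdepth = 3,  # analogous to max = 3
  minsplit = 20, # analogous to min = 20
  cp = 0         # analogous to cp = 0: no complexity penalty on splits
)

fit <- rpart(
  Kyphosis ~ Age + Number + Start,
  data = kyphosis, method = "class", control = ctrl
)

# type = 2 labels all nodes and draws split labels below the node labels
rpart.plot(fit, type = 2)

# Depth of each node derived from its node number (root = 1 has depth 0)
node_depth <- floor(log2(as.integer(rownames(fit$frame))))
max(node_depth)
```

Raising cp above 0 in this sketch makes rpart skip splits that improve the fit by less than that factor, which usually yields a smaller tree.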
Value
(Invisible) list of type 'tree_var' with plot (function), model, predictions, performance metrics, and auxiliary interpretation text.
Author(s)
Stephen Milborrow, borrowing heavily from the rpart
package by Terry M. Therneau and Beth Atkinson,
and the R port of that package by Brian Ripley.
References
Breiman L., Friedman J. H., Olshen R. A., and Stone, C. J. (1984) Classification and Regression Trees. Wadsworth.
See Also
Other Exploratory: corr_cross(), corr_var(), crosstab(), df_str(), distr(), freqs_df(), freqs_list(), freqs_plot(), freqs(), lasso_vars(), missingness(), plot_cats(), plot_df(), plot_nums()
Other Visualization: distr(), freqs_df(), freqs_list(), freqs_plot(), freqs(), noPlot(), plot_chord(), plot_survey(), plot_timeline()
Examples
data(dft)
# Regression Tree
tree <- tree_var(dft, Fare, subtitle = "Titanic dataset")
tree$plot() # tree plot
tree$model # rpart model object
tree$performance # metrics
# Binary Tree
tree_var(dft, Survived_TRUE, explain = FALSE, cex = 0.8)$plot()
# Multiclass tree
tree_var(dft[, c("Pclass", "Fare", "Age")], Pclass, ohse = FALSE)$plot()