isotree.to.graphviz {isotree}    R Documentation

Generate GraphViz Dot Representation of Tree

Description

Generate GraphViz representations of model trees in 'dot' format, either separately for each tree (the default) or for a single tree (if passing 'tree'). Can also be made to output terminal node numbers (numbering starts at one).

These can be loaded as graphs through e.g. 'DiagrammeR::grViz(x)', where 'x' would be the output of this function for a given tree.

Graph format is based on XGBoost's.
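
Since each tree is returned as a plain character string, the output can also be saved to '.dot' files for use with external GraphViz tools. The following is a minimal sketch of that idea, not part of the package's own examples; the file names and the use of a temporary directory are assumptions made for illustration.

```r
library(isotree)

# Small model, fit only so that there is something to export
set.seed(1)
X <- matrix(rnorm(50 * 2), nrow = 50)
model <- isolation.forest(X, ndim = 1, max_depth = 2, ntrees = 3, nthreads = 1)

# One 'dot' string per tree (default behavior with tree = NULL)
dots <- isotree.to.graphviz(model)

# Write each tree to its own file (hypothetical names: tree_1.dot, tree_2.dot, ...)
out_dir <- tempdir()
paths <- file.path(out_dir, sprintf("tree_%d.dot", seq_along(dots)))
for (i in seq_along(dots)) {
    writeLines(dots[[i]], paths[i])
}

# The files could then be rendered outside of R with the GraphViz CLI, e.g.:
#   dot -Tpng tree_1.dot -o tree_1.png
```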

Usage

isotree.to.graphviz(
  model,
  output_tree_num = FALSE,
  tree = NULL,
  column_names = NULL,
  column_names_categ = NULL,
  nthreads = model$nthreads
)

Arguments

model

An Isolation Forest object as returned by isolation.forest.

output_tree_num

Whether the output should report the terminal node number instead of the isolation depth. Numbering starts at one.

tree

Tree for which to generate the 'dot' representation. If passed, will generate the output only for that single tree. If passing 'NULL', will generate outputs for all trees in the model.

column_names

Column names to use for the numeric columns. If not passed and the model was fit to a 'data.frame', will use the column names from that 'data.frame', which can be found under 'model$metadata$cols_num'. If not passed and the model was fit to data in a format other than 'data.frame', the columns will be named 'column_N' in the resulting output. Note that the names will be taken verbatim: when exporting to SQL, this function will not check whether they constitute valid SQL and will not escape characters such as double quotation marks.

column_names_categ

Column names to use for the categorical columns. If not passed, will use the column names from the 'data.frame' to which the model was fit. These can be found under 'model$metadata$cols_cat'.

nthreads

Number of parallel threads to use.
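
As an illustration of how the arguments above combine, the sketch below fits a model to an unnamed matrix, supplies custom numeric column names, and requests terminal node numbers instead of isolation depths. The names 'feat_a', 'feat_b' and 'feat_c' are hypothetical and chosen only for this example.

```r
library(isotree)

# Model fit to a plain matrix, so there are no column names to inherit
set.seed(1)
X <- matrix(rnorm(100 * 3), nrow = 100)
model <- isolation.forest(X, ndim = 1, max_depth = 3, ntrees = 5, nthreads = 1)

# Hypothetical names, one per numeric column, used verbatim in the output
custom_names <- c("feat_a", "feat_b", "feat_c")

# Default output: terminal nodes report the isolation depth
dots_depth <- isotree.to.graphviz(model, column_names = custom_names, nthreads = 1)

# Alternative output: terminal nodes report the terminal node number
dots_nodes <- isotree.to.graphviz(
    model,
    output_tree_num = TRUE,
    column_names = custom_names,
    nthreads = 1
)

# Inspect the first tree of each variant as raw 'dot' text
cat(dots_depth[[1]])
cat(dots_nodes[[1]])
```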

Value

If passing 'tree=NULL', will return a list with one element per tree in the model, where each element consists of an R character / string with the 'dot' format representation of the tree. If passing 'tree', the output will be instead a single character / string element with the 'dot' representation for that tree.
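
The two return shapes can be compared with a short sketch such as the one below. It assumes that trees are numbered starting at one, in line with the usual R convention; this is not stated explicitly above.

```r
library(isotree)

set.seed(1)
X <- matrix(rnorm(100 * 3), nrow = 100)
model <- isolation.forest(X, ndim = 1, max_depth = 3, ntrees = 4, nthreads = 1)

# tree = NULL (default): a list with one 'dot' string per tree
all_trees <- isotree.to.graphviz(model)
is.list(all_trees)
length(all_trees)

# tree = 1 (assumed to mean the first tree): a single 'dot' string
one_tree <- isotree.to.graphviz(model, tree = 1)
is.character(one_tree)
length(one_tree)
```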

Examples

library(isotree)
set.seed(123)
X <- matrix(rnorm(100 * 3), nrow = 100)
model <- isolation.forest(X, ndim=1, max_depth=3, ntrees=2, nthreads=1)
model_as_graphviz <- isotree.to.graphviz(model)

# These can be parsed and plotted with library 'DiagrammeR'
if (require("DiagrammeR")) {
    # first tree
    DiagrammeR::grViz(model_as_graphviz[[1]])

    # second tree
    DiagrammeR::grViz(model_as_graphviz[[2]])
}

[Package isotree version 0.6.1-1 Index]