plot.interactionfor {diversityForest}    R Documentation

Plot method for interactionfor objects

Description

Plot function for interactionfor objects that provides a first overview of the results of the interaction forest analysis. It visualises the distributions of the EIM values and the estimated forms of the bivariable influences of the variable pairs with the largest quantitative and qualitative EIM values. Further visual exploration of the results should be conducted using plotEffects.

Usage

## S3 method for class 'interactionfor'
plot(x, numpairsquant = 2, numpairsqual = 2, ...)

Arguments

x

Object of class interactionfor.

numpairsquant

The number of variable pairs with the largest quantitative EIM values to plot. Default is 2.

numpairsqual

The number of variable pairs with the largest qualitative EIM values to plot. Default is 2.

...

Further arguments passed to or from other methods.

Details

For details on the plots of the estimated forms of the bivariable influences of the variable pairs, see plotEffects.

NOTE: The p-values shown in the plots are generally much too optimistic and MUST NOT be reported as the result of a statistical test for significance of interaction. To obtain adjusted p-values that correspond to valid tests, these p-values can be multiplied by the number of possible variable pairs, which yields Bonferroni-adjusted p-values. See the 'Details' section of plotEffects for further explanation and guidance. Note, however, that these Bonferroni-adjusted p-values should be interpreted with caution: because they stem from bivariable models, they do not take the multivariable nature of the data into account.
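
The adjustment described above can be sketched in base R. This is only an illustration: the p-values and the number of covariates below are hypothetical, and the number of possible variable pairs is assumed to be the number of unordered pairs of covariates, choose(n_vars, 2).

```r
# Number of covariates (hypothetical value for illustration).
n_vars <- 9

# Number of possible variable pairs
# (assumption: all unordered pairs of covariates).
n_pairs <- choose(n_vars, 2)

# Hypothetical unadjusted p-values, as they might be read off the plots.
p_raw <- c(0.0004, 0.003, 0.02)

# Bonferroni adjustment: multiply by the number of pairs and cap at 1.
p_adj <- pmin(1, p_raw * n_pairs)

# Equivalent call using stats::p.adjust(); 'n' may exceed length(p_raw).
p_adj_alt <- p.adjust(p_raw, method = "bonferroni", n = n_pairs)

print(p_adj)
```

Even after this adjustment, the caveat above still applies: the values come from bivariable models and should be read as a rough guide.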

NOTE ALSO: As described in Hornung & Boulesteix (2022), the univariable EIM values can be biased for data with larger numbers of variables (more than 100, and more seriously for high-dimensional data). It is therefore strongly recommended to interpret the univariable EIM values with caution if the data are high-dimensional. If the univariable importance of the variables is of interest for high-dimensional data, an additional conventional random forest (e.g., using the ranger package) should be constructed, and the variable importance values of this random forest should be used to rank the univariable effects.
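
A minimal sketch of such an additional random forest is given below, assuming the ranger package is installed. The choice of permutation importance and of num.trees = 500 is an assumption made for illustration, not a recommendation from this help page. The stock data have few variables and are used only to keep the sketch in line with the example below.

```r
# Load the example data shipped with diversityForest.
data(stock, package = "diversityForest")

# Set seed to make results reproducible.
set.seed(1234)

# Conventional random forest with permutation variable importance
# (importance mode and number of trees are illustrative assumptions).
rf <- ranger::ranger(dependent.variable.name = "company10", data = stock,
                     num.trees = 500, importance = "permutation")

# Rank the variables by their importance values, largest first.
vim_ranked <- sort(rf$variable.importance, decreasing = TRUE)
print(vim_ranked)
```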

Value

A ggplot2 plot.

Author(s)

Roman Hornung

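References

Hornung, R., Boulesteix, A.-L. (2022). Interaction forests: Identifying and exploiting interpretable quantitative and qualitative interaction effects. Computational Statistics & Data Analysis 171:107460.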

See Also

plotEffects

Examples

## Not run: 

## Load package:

library("diversityForest")



## Set seed to make results reproducible:

set.seed(1234)



## Construct interaction forest and calculate EIM values:

data(stock)
model <- interactionfor(dependent.variable.name = "company10", data = stock, 
                        num.trees = 20)

# NOTE: num.trees = 20 (as used above) would be much too small for practical
# purposes. This small number of trees was used only to keep the
# runtime of the example short.
# The default number of trees is num.trees = 20000 if EIM values are calculated
# and num.trees = 2000 otherwise.



## When plot() is called without further arguments, the estimated
## bivariable influences of the two pairs with the largest quantitative
## and qualitative EIM values are shown by default:

plot(model)

# It is, however, also possible to change the numbers of
# pairs with the largest quantitative and qualitative EIM values
# to be shown:

plot(model, numpairsquant = 4, numpairsqual = 3)


## End(Not run)


[Package diversityForest version 0.4.0 Index]