plot.interactionfor {diversityForest} | R Documentation |
Plot method for interactionfor objects
Description
Plot function for interactionfor objects that provides a first overview of the results of the
interaction forest analysis. This function visualises the distributions of the EIM values and
the estimated forms of the bivariable influences of the variable pairs with the largest quantitative and
qualitative EIM values. Further visual exploration of the results of the interaction
forest analysis should be conducted using plotEffects.
Usage
## S3 method for class 'interactionfor'
plot(x, numpairsquant = 2, numpairsqual = 2, ...)
Arguments
x | Object of class interactionfor. |
numpairsquant | The number of pairs with the largest quantitative EIM values to plot. Default is 2. |
numpairsqual | The number of pairs with the largest qualitative EIM values to plot. Default is 2. |
... | Further arguments passed to or from other methods. |
Details
For details on the plots of the estimated forms of the bivariable influences of the variable pairs, see plotEffects.
NOTE: The p-values shown in the plots are generally much too optimistic and MUST NOT be reported
as the result of a statistical test for significance of interaction. To obtain adjusted p-values that correspond to
valid tests, these p-values can be multiplied by the number of possible variable pairs,
which yields Bonferroni-adjusted p-values. See the 'Details' section of plotEffects for further
explanation and guidance. Note, however, that these Bonferroni-adjusted p-values should be interpreted
with caution: because they stem from bivariable models, they do not take the multivariable nature
of the data into account.
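As a minimal sketch of the Bonferroni adjustment described above: the p-values and the number of variables below are hypothetical values chosen for illustration (they are not output of this package), and the sketch assumes that the number of possible variable pairs is the number of unordered pairs of covariates. The adjusted values are capped at 1, as is usual for Bonferroni-adjusted p-values.

```r
## Hypothetical unadjusted p-values, as they might be read off the plots:
p_unadjusted <- c(0.0004, 0.003, 0.02)

## Hypothetical number of covariates in the data:
num_variables <- 10

## Number of possible (unordered) variable pairs:
num_pairs <- choose(num_variables, 2)

## Bonferroni adjustment: multiply by the number of pairs and cap at 1:
p_bonferroni <- pmin(1, p_unadjusted * num_pairs)
p_bonferroni

## The same adjustment using base R's p.adjust():
p.adjust(p_unadjusted, method = "bonferroni", n = num_pairs)
```

Both calls give the same values. The caveat from the note above still applies: the adjusted p-values stem from bivariable models.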
NOTE ALSO: As described in Hornung & Boulesteix (2022), the univariable EIM values can be biased for data with larger numbers of variables
(more than 100, and more seriously for high-dimensional data). Therefore, it is strongly recommended to interpret the univariable EIM values
with caution if the data are high-dimensional. If it is of interest to measure the univariable importance of the variables for high-dimensional data,
an additional conventional random forest (e.g., using the ranger package) should be constructed, and the variable importance measure values
of this random forest should be used for ranking the univariable effects.
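The following sketch shows one way such an additional conventional random forest could be constructed with ranger. It assumes that the ranger and diversityForest packages are installed; the choice of the permutation importance and of num.trees = 500 are assumptions of this sketch, not recommendations from the package author. The stock data are used only for illustration; they are not high-dimensional.

```r
## Sketch only: assumes the 'ranger' and 'diversityForest' packages are installed.
library("ranger")
data(stock, package = "diversityForest")

## Set seed to make results reproducible:
set.seed(1234)

## Conventional random forest with permutation variable importance
## (the importance measure and num.trees are assumptions of this sketch):
rf <- ranger(dependent.variable.name = "company10", data = stock,
             num.trees = 500, importance = "permutation")

## Rank the variables by decreasing variable importance:
vim_sorted <- sort(rf$variable.importance, decreasing = TRUE)
head(vim_sorted)
```

The ranking in vim_sorted could then be used in place of the univariable EIM values when ordering the univariable effects.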
Value
A ggplot2 plot.
Author(s)
Roman Hornung
References
Hornung, R., Boulesteix, A.-L. (2022). Interaction forests: Identifying and exploiting interpretable quantitative and qualitative interaction effects. Computational Statistics & Data Analysis 171:107460, <doi:10.1016/j.csda.2022.107460>.
Hornung, R. (2022). Diversity forests: Using split sampling to enable innovative complex split procedures in random forests. SN Computer Science 3(2):1, <doi:10.1007/s42979-021-00920-1>.
See Also
Examples
## Not run:
## Load package:
library("diversityForest")
## Set seed to make results reproducible:
set.seed(1234)
## Construct interaction forest and calculate EIM values:
data(stock)
model <- interactionfor(dependent.variable.name = "company10", data = stock,
                        num.trees = 20)
# NOTE: num.trees = 20 (in the above) would be much too small for practical
# purposes. This small number of trees was simply used to keep the
# runtime of the example short.
# The default number of trees is num.trees = 20000 if EIM values are calculated
# and num.trees = 2000 otherwise.
## When using the plot() function without further specifications,
## by default the estimated bivariable influences of the two pairs with largest quantitative
## and qualitative EIM values are shown:
plot(model)
# It is, however, also possible to change the numbers of
# pairs with largest quantitative and qualitative EIM values
# to be shown:
plot(model, numpairsquant = 4, numpairsqual = 3)
## End(Not run)