plotPair {diversityForest} | R Documentation |
Plot of the (estimated) simultaneous influence of two variables
Description
This function visualises the (estimated) bivariable influence of a single specific pair of variables on the outcome. The estimation and plotting are performed in the same way as in plotEffects. However, plotPair does not require an interactionfor object and can thus also be used without a constructed interaction forest.
Usage
plotPair(
pair,
yvarname,
statusvarname = NULL,
data,
levelsorder1 = NULL,
levelsorder2 = NULL,
categprob = NULL,
pvalue = TRUE,
returnseparate = FALSE,
intobj = NULL
)
Arguments
pair |
Character string vector of length two, where the first character string gives the name of the first member of the respective pair to plot and the second character string gives the name of the second member.
Note that the order of the two pair members in |
yvarname |
Name of outcome variable. |
statusvarname |
Name of status variable, only applicable to survival data. |
data |
Data frame containing the variables. |
levelsorder1 |
Optional. Order the categories of the first variable should have in the plot (if it is categorical). Character string vector, where the i-th entry contains the name of the category that should take the i-th place in the ordering of the categories of the first variable. |
levelsorder2 |
Optional. Order the categories of the second variable should have in the plot (if it is categorical). Character string vector specified in an analogous
way as |
categprob |
Optional. Only relevant for categorical outcomes with more than two classes.
Name of the class for which probabilities should be estimated. As described in |
pvalue |
Set to |
returnseparate |
Set to |
intobj |
Optional. Object of class |
Details
See the 'Details' section of plotEffects.
Value
A ggplot2 plot.
Author(s)
Roman Hornung
References
Hornung, R., Boulesteix, A.-L. (2022). Interaction forests: Identifying and exploiting interpretable quantitative and qualitative interaction effects. Computational Statistics & Data Analysis 171:107460, <doi:10.1016/j.csda.2022.107460>.
Hornung, R. (2022). Diversity forests: Using split sampling to enable innovative complex split procedures in random forests. SN Computer Science 3(2):1, <doi:10.1007/s42979-021-00920-1>.
See Also
plotEffects, plot.interactionfor
Examples
## Not run:
## Load package:
library("diversityForest")
## Visualise the estimated bivariable influence of 'toothed' and 'feathers' on
## the probability of type="mammal":
data(zoo)
plotPair(pair = c("toothed", "feathers"), yvarname = "type", data = zoo)
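## A sketch for outcomes with more than two classes: 'categprob' selects the
## class whose probability is plotted. The class "bird" used here is an
## assumption; check the classes actually present in 'type' first:
table(zoo$type)
plotPair(pair = c("toothed", "feathers"), yvarname = "type", data = zoo,
         categprob = "bird")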
## Visualise the estimated bivariable influence of 'creat' and 'hgb' on
## survival (more precisely, on the log hazard ratio compared to the
## median effect):
library("survival")
mgus2compl <- mgus2[complete.cases(mgus2), ]
plotPair(pair = c("creat", "hgb"), yvarname = "futime", statusvarname = "death",
         data = mgus2compl)
# Problem: The outliers in the left plot make it difficult to see what is going
# on in the region with creat values smaller than about two even though the
# majority of values lie there.
# --> Solution: We re-run the above line with returnseparate = TRUE, which
# returns the two ggplot plots separately. These can then be manipulated, for
# example by changing the x-axis range in order to remove the outliers:
ps <- plotPair(pair = c("creat", "hgb"), yvarname = "futime", statusvarname = "death",
               data = mgus2compl, returnseparate = TRUE)
# Change the x-axis range:
library("ggplot2")
ps[[1]] + xlim(c(0.5,2))
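# Possible alternative (a sketch, not part of the original example): xlim()
# drops observations outside the range before the plot is drawn, whereas
# ggplot2's coord_cartesian() only zooms into the range and keeps all data:
ps[[1]] + coord_cartesian(xlim = c(0.5, 2))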
# Save the plot:
# ggsave(filename = "mypathtofolder/FigureXY1.pdf", width = 7, height = 6)
# We can, for example, also change the label sizes of the second plot:
# With original label sizes:
ps[[2]]
# With larger label sizes:
ps[[2]] + theme(axis.title=element_text(size=15))
# Save the plot:
# library("ggplot2")
# ggsave(filename = "mypathtofolder/FigureXY2.pdf", width = 7, height = 6)
## End(Not run)