plotPair {diversityForest}    R Documentation

Plot of the (estimated) simultaneous influence of two variables

Description

This function visualises the (estimated) bivariable influence of a single specified pair of variables on the outcome. Estimation and plotting are performed in the same way as in plotEffects. However, plotPair does not require an interactionfor object and can therefore also be used without a constructed interaction forest.

Usage

plotPair(
  pair,
  yvarname,
  statusvarname = NULL,
  data,
  levelsorder1 = NULL,
  levelsorder2 = NULL,
  categprob = NULL,
  pvalue = TRUE,
  returnseparate = FALSE,
  intobj = NULL
)

Arguments

pair

Character string vector of length two, where the first character string gives the name of the first member of the respective pair to plot and the second character string gives the name of the second member. Note that the order of the two pair members in pair determines how the results are visualised: The estimated influence of the second pair member is visualised conditionally on different values of the first pair member.

yvarname

Name of outcome variable.

statusvarname

Name of status variable, only applicable to survival data.

data

Data frame containing the variables.

levelsorder1

Optional. Order in which the categories of the first variable should appear in the plot (only relevant if that variable is categorical). Character string vector, where the i-th entry contains the name of the category that should take the i-th place in the ordering of the categories of the first variable.

levelsorder2

Optional. Order in which the categories of the second variable should appear in the plot (only relevant if that variable is categorical). Character string vector specified in the same way as for levelsorder1.

categprob

Optional. Only relevant for categorical outcomes with more than two classes. Name of the class for which probabilities should be estimated. As described in plotEffects, for categorical outcomes with more than two classes, the probabilities for the largest class (i.e., the class with the most observations) are estimated by default when visualising the bivariable influence of the variables. With categprob, a different class can be chosen for which the probabilities are estimated.

pvalue

Set to TRUE (default) to add to the plot a p-value from a test for an interaction effect, obtained using a classical parametric regression approach. For categorical outcomes, logistic regression is used; for metric outcomes, linear regression; and for survival outcomes, Cox regression. See the 'Details' section of plotEffects for further details.

returnseparate

Set to TRUE to invisibly return the two generated ggplot plots separately as a list. This is useful because it allows the resulting plots to be manipulated (label size etc.) and makes it possible to use only one of the two plots. The default is FALSE, in which case the two plots are returned together as a ggarrange object.

intobj

Optional. Object of class interactionfor. If this is provided, the ordering of the categories obtained when constructing the interaction forest will be used for categorical variables. See Hornung & Boulesteix (2022) for details.

Details

See the 'Details' section of plotEffects.

Value

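By default, a ggarrange object containing the two generated plots. If returnseparate = TRUE, a list containing the two ggplot plots, returned invisibly.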

Author(s)

Roman Hornung

References

See Also

plotEffects, plot.interactionfor

Examples

## Not run: 

## Load package:

library("diversityForest")



## Visualise the estimated bivariable influence of 'toothed' and 'feathers' on
## the probability of type="mammal":

data(zoo)
plotPair(pair = c("toothed", "feathers"), yvarname="type", data = zoo)
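
## Variation (a sketch, not taken from the package examples): swap the order
## of the pair members, so that the estimated influence of 'toothed' is shown
## conditionally on 'feathers' instead of the other way round:

plotPair(pair = c("feathers", "toothed"), yvarname = "type", data = zoo)

## Estimate the probabilities for a class other than the largest one using
## 'categprob'. The class is picked programmatically here (the second most
## frequent value of 'type'), so that no class name has to be assumed:

secondclass <- names(sort(table(zoo$type), decreasing = TRUE))[2]
plotPair(pair = c("toothed", "feathers"), yvarname = "type", data = zoo,
         categprob = secondclass)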



## Visualise the estimated bivariable influence of 'creat' and 'hgb' on
## survival (more precisely, on the log hazards ratio compared to the
## median effect):

library("survival")
mgus2compl <- mgus2[complete.cases(mgus2),]
plotPair(pair=c("creat", "hgb"), yvarname="futime", statusvarname = "death", data=mgus2compl)

# Problem: The outliers in the left plot make it difficult to see what is going
# on in the region with creat values smaller than about two even though the
# majority of values lie there.

# --> Solution: We re-run the above line with returnseparate = TRUE. This
# returns the two ggplot plots separately, so that the x-axis range of the
# first plot can be restricted to exclude the outliers:

ps <- plotPair(pair=c("creat", "hgb"), yvarname="futime", statusvarname = "death", 
               data=mgus2compl, returnseparate = TRUE)

# Change the x-axis range:
library("ggplot2")
ps[[1]] + xlim(c(0.5,2))
# Save the plot:
# ggsave(file="mypathtofolder/FigureXY1.pdf", width=7, height=6)

# We can, for example, also change the label sizes of the second plot:
# With original label sizes:
ps[[2]]
# With larger label sizes:
ps[[2]] + theme(axis.title = element_text(size = 15))
# Save the plot:
# library("ggplot2")
# ggsave(file="mypathtofolder/FigureXY2.pdf", width=7, height=6)
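
## Sketch of using 'intobj' (assumptions: the interaction forest is
## constructed with interactionfor() as shown, and num.trees = 2000 is an
## arbitrary small value chosen only to keep the run time short). If the
## forest is passed via 'intobj', the ordering of the categories obtained
## when constructing the forest is used for categorical variables:

data(zoo)
model <- interactionfor(dependent.variable.name = "type", data = zoo,
                        num.trees = 2000)
plotPair(pair = c("toothed", "feathers"), yvarname = "type", data = zoo,
         intobj = model)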


## End(Not run)


[Package diversityForest version 0.4.0 Index]