Density-Box-Plot {REdaS}

R Documentation

Density-Box-Plots

Description

This function draws a (grouped) boxplot-like plot with kernel density estimators.

Usage

densbox(formula, data, rug = FALSE, from, to, gsep = .5, kernel, bw, main, ylab,
    var_names, box_out = TRUE, horizontal = FALSE, ...)

Arguments

`formula`	a `formula` object that references elements in `data`, see Details
`data`	a data frame containing the variables specified in formula
`rug`	a logical value to add a rug to the individual density-boxes
`from`	an optional lower boundary for the kernel density estimation (see `density`)
`to`	an optional upper boundary for the kernel density estimation (see `density`)
`gsep`	a numeric value `>= 0` that specifies the amount of separation between groups if two or more grouping variables are used
`kernel`	a string specifying the type of the kernel (default: `"gaussian"`, see `density`)
`bw`	the bandwidth for kernel density estimation (see `density`)
`main`	a character object for the title
`ylab`	a character object for the `y`-axis label
`var_names`	a character object to print grouping variables' names in the lower left margin – grouping variables are treated in the order they are given in the formula
`box_out`	if `TRUE` (default), outliers are treated as in standard boxplots (plotted as stars outside the boxplot's whiskers); if `FALSE`, outliers are not treated differently, i.e., the whiskers extend to the minimum and maximum of the data, no matter how far individual observations are from the median relative to the IQR (interquartile range; see `boxplot.stats` and `fivenum` for details on the computation of boxplot statistics)
`horizontal`	not yet implemented
`...`	further arguments, see Details
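
Several of the arguments above point to `density` and `boxplot.stats`. The following sketch uses only those base R functions to show what `from`, `to`, `kernel`, `bw` and the outlier rule mean there. It is an illustration of the referenced helpers under made-up data, not a description of how `densbox` is implemented internally.

```r
# Base R only; the data are hypothetical.
set.seed(1L)
x <- c(rlnorm(100, 1, 0.5), 40)  # log-normal sample plus one artificial extreme value

# from/to bound the grid on which the density is evaluated;
# kernel and bw choose the smoothing kernel and its bandwidth.
d <- density(x, from = 0, to = 10, kernel = "rectangular", bw = 0.25)
range(d$x)

# Standard boxplot rule (coef = 1.5): points beyond the whiskers are flagged.
bs <- boxplot.stats(x)
bs$out    # flagged outliers
bs$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker

# With coef = 0 the whiskers extend to the data extremes, which matches
# the behaviour described for box_out = FALSE.
full <- boxplot.stats(x, coef = 0)
full$stats
```

With `coef = 0`, `boxplot.stats` returns the same five numbers as `fivenum(x)`, so the extreme value becomes the upper whisker instead of a separately plotted point.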

Details

This function plots a combination of boxplots and kernel density plots to get a more informative graphic of a metric dependent variable with respect to grouped data. The central element is the formula argument that defines the dependent variable (dv) and grouping variables (independent variables, iv). For a meaningful plot, the ivs should be categorical variables (they are treated as factors).

In the simplest case, there is no grouping, so `formula` is `dv ~ 1`. As grouping variables are added, the plot is split up accordingly. Note that the ordering of ivs in the formula defines how the plot is split up: the first variable is the most general grouping, the second forms subgroups within the first variable's groups, and so on.
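
To make the nesting order concrete, the sketch below lists the group cells implied by a two-variable formula, using base R only. The data frame `d` is hypothetical, and the code only mirrors the ordering rule stated above; it is not taken from `densbox` itself.

```r
# Hypothetical data with the same layout as in the Examples section.
d <- data.frame(y  = seq_len(8),
                x1 = rep(c("A", "B"), each = 4),
                x2 = rep(c("X", "Y", "X", "Y"), each = 2))

# Grouping variables in the order they appear on the right-hand side.
ivs <- all.vars(y ~ x2 + x1)[-1]
ivs

# First variable = outer grouping, second variable = subgroups within it.
ord   <- order(d[[ivs[1]]], d[[ivs[2]]])
cells <- unique(d[ord, ivs])
cells
```

Swapping the formula to `y ~ x1 + x2` would make `x1` the outer grouping instead, so the same four cells would be arranged as A/X, A/Y, B/X, B/Y.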

If a level of a factor is completely missing ab initio, the level will be dropped. Subgroups with fewer than 5 observations will be dropped and “<5” will be plotted instead.
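
Before plotting, it can help to check which subgroups fall under the 5-observation threshold. The following base R sketch does this for a hypothetical data frame; the use of `droplevels` and `table` is an assumption made for illustration and need not match what `densbox` does internally.

```r
# Hypothetical data: group "B" has only 3 observations,
# and level "Y" of x2 never occurs.
set.seed(2L)
d <- data.frame(y  = rnorm(23),
                x1 = factor(rep(c("A", "B"), times = c(20, 3))),
                x2 = factor(rep("X", 23), levels = c("X", "Y")))

# A level without any observations can be removed up front.
x2_used <- droplevels(d$x2)
levels(x2_used)

# Cell counts per subgroup; cells with fewer than 5 observations
# are the ones described above as being replaced by "<5".
n <- as.data.frame(table(x1 = d$x1, x2 = x2_used), stringsAsFactors = FALSE)
too_small <- n[n$Freq < 5, ]
too_small
```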

Author(s)

Marco J. Maier

Examples

# plot a density-box-plot of one (log-normal) variable
set.seed(5L)
data1 <- rlnorm(100, 1, .5)
densbox(data1 ~ 1, from = 0, rug = TRUE)

# a normally distributed variable with 2 grouping variables (group means 0, 1, -1, 0)
data2 <- data.frame(y  = rnorm(400, rep(c(0, 1, -1, 0), each = 100), 1),
                    x1 = rep(c("A", "B"), each = 200),
                    x2 = rep(c("X", "Y", "X", "Y"), each = 100))
with(data2, tapply(y, list(x1, x2), mean))

# a density-box-plot of the data with a custom title
# and custom labels for the grouping variables
densbox(y ~ x2 + x1, data2, main = "Plot with some\nSpecials",
  var_names = c("Second\nVariable", "First Variable"))

# the same plot with a rug and ignoring outliers in the boxplot
densbox(y ~ x2 + x1, data2, rug = TRUE, box_out = FALSE)

# density-box-plot with the same data, but no additional space between groups
# by setting gsep = 0.
# the kernel density plots have a rectangular kernel with a bandwidth of 0.25
# which results in a "jagged" appearance.
densbox(y ~ x2 + x1, data2, gsep = 0, kernel = "rectangular", bw = 0.25)

[Package REdaS version 0.9.4 Index]