barPlot {dittoViz}

R Documentation

Outputs a stacked bar plot to show the percent composition of samples, groups, clusters, or other groupings

Description

Outputs a stacked bar plot to show the percent composition of samples, groups, clusters, or other groupings.

Usage

barPlot(
  data_frame,
  var,
  group.by,
  scale = c("percent", "count"),
  split.by = NULL,
  rows.use = NULL,
  retain.factor.levels = TRUE,
  data.out = FALSE,
  data.only = FALSE,
  do.hover = FALSE,
  hover.round.digits = 5,
  color.panel = dittoColors(),
  colors = seq_along(color.panel),
  split.nrow = NULL,
  split.ncol = NULL,
  split.adjust = list(),
  y.breaks = NA,
  min = 0,
  max = NA,
  var.labels.rename = NULL,
  var.labels.reorder = NULL,
  x.labels = NULL,
  x.labels.rotate = TRUE,
  x.reorder = NULL,
  theme = theme_classic(),
  xlab = group.by,
  ylab = "make",
  main = "make",
  sub = NULL,
  legend.show = TRUE,
  legend.title = NULL
)

Arguments

`data_frame`	A data.frame where columns are features and rows are observations you might wish to visualize.
`var`	Single string representing the name of a column of `data_frame` to quantify within x-axis groups.
`group.by`	Single string representing the name of a column of `data_frame` to use for separating data across discrete x-axis groups.
`scale`	"count" or "percent". Sets whether data should be shown as counts or as percentages. Default = "percent".
`split.by`	1 or 2 strings denoting the name(s) of column(s) of `data_frame` containing discrete data to use for faceting / separating data points into separate plots. When 2 columns are named, c(row,col), the first is used as rows and the second is used for columns of the resulting facet grid. When 1 column is named, shape control can be achieved with `split.nrow` and `split.ncol`.
`rows.use`	String vector of rownames of `data_frame` OR an integer vector specifying the row-indices of data points which should be plotted. Alternatively, a logical vector, the same length as the number of rows in `data_frame`, where `TRUE` values indicate which rows to plot.
`retain.factor.levels`	Logical which controls whether factor identities of `var` and `group.by` data should be respected. Set to `TRUE` to faithfully reflect the ordering of groupings encoded in factor levels, but note that this will also force retention of groupings that could otherwise be removed via `rows.use`.
`data.out`	Logical. When set to `TRUE`, changes the output, from the plot alone, to a list containing the plot ("p") and a data.frame ("data") containing the underlying data.
`data.only`	Logical. When set to `TRUE`, the underlying data will be returned, but not the plot itself.
`do.hover`	Logical which sets whether the ggplot output should be converted to a ggplotly object with data about individual bars displayed when you hover your cursor over them.
`hover.round.digits`	Integer number specifying the number of decimal digits to round displayed numeric values to, when `do.hover` is set to `TRUE`.
`color.panel`	String vector which sets the colors to draw from for data representation fills. Default = `dittoColors()`. A named vector can be used if names are matched to the distinct values of the `var` data.
`colors`	Integer vector, the indexes / order, of colors from `color.panel` to actually use. Useful for quickly swapping around colors of the default set (when not using names for color matching).
`split.nrow`, `split.ncol`	Integers which set the dimensions of faceting/splitting when faceting by a single feature.
`split.adjust`	A named list which allows extra parameters to be pushed through to the faceting function call. List elements should be valid inputs to the faceting functions, e.g. 'list(scales = "free")'. For options, when giving 1 column to `split.by`, see `facet_wrap`, OR when giving 2 columns to `split.by`, see `facet_grid`.
`y.breaks`	Numeric vector which sets the plot's tick marks / major gridlines. c(break1,break2,break3,etc.)
`min`, `max`	Scalars which control the zoom of the plot. These inputs set the minimum / maximum values of the y-axis. Default = set based on the limits of the data: 0 to 1 for `scale = "percent"`, or 0 to the maximum count for `scale = "count"`.
`var.labels.rename`	String vector for renaming the distinct identities of `var`-values. This vector must be the same length as the number of levels or unique values in the `var`-data. Hint: use `colLevels` or `unique(data_frame[,var])` to check the original values.
`var.labels.reorder`	Integer vector. A sequence of numbers, from 1 to the number of distinct `var`-value identities, for rearranging the order of the `var` groupings within the plot space. Method: Make a first plot without this input. Then, treat the top-most grouping as index 1 and the bottom-most as index n. Values of `var.labels.reorder` should be these indices, given in the order that you would like the groupings rearranged to be.
`x.labels`	String vector which will replace the x-axis groupings' labels. Regardless of `x.reorder`, the first component of `x.labels` sets the name for the left-most x-axis grouping.
`x.labels.rotate`	Logical which sets whether the x-axis grouping labels should be rotated.
`x.reorder`	Integer vector. A sequence of numbers, from 1 to the number of groupings, for rearranging the order of x-axis groupings. Method: Make a first plot without this input. Then, treat the leftmost grouping as index 1 and the rightmost as index n. Values of `x.reorder` should be these indices, given in the order that you would like the groupings rearranged to be. Recommendation for advanced users: If you find yourself coming back to this input too many times, an alternative solution that can be easier long-term is to make the target data into a factor, and to put its levels in the desired order: `factor(data, levels = c("level1", "level2", ...))`.
`theme`	A ggplot theme which will be applied before dittoViz adjustments. Default = `theme_classic()`. See https://ggplot2.tidyverse.org/reference/ggtheme.html for other options and ideas.
`xlab`	String which sets the x-axis title. Default is `group.by` so it defaults to the name of the grouping information. Set to `NULL` to remove.
`ylab`	String which sets the y-axis title. Default = "make"; if left as "make", a title will be generated automatically.
`main`	String which sets the plot title.
`sub`	String which sets the plot subtitle.
`legend.show`	Logical. Whether the legend should be displayed. Default = `TRUE`.
`legend.title`	String which adds a title to the legend.
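
The factor-based alternative recommended under `x.reorder` can be sketched as follows. This is a minimal illustration, not part of the package documentation: the toy data frame `df` and its values are assumptions made up for the example, and the `barPlot()` call is guarded so the snippet also runs where dittoViz is not installed.

```r
# Hypothetical toy data; the column names mirror the package examples,
# but the values are made up for illustration.
df <- data.frame(
    groups = c("B", "A", "C", "A", "B", "C"),
    clustering = c("1", "2", "1", "1", "2", "2"),
    stringsAsFactors = FALSE
)

# Encode the desired left-to-right order of x-axis groupings in the
# factor levels, instead of passing 'x.reorder' on every call.
df$groups <- factor(df$groups, levels = c("C", "A", "B"))

# With retain.factor.levels = TRUE (the default), the documented behavior
# is that groupings follow the factor levels.
p <- NULL
if (requireNamespace("dittoViz", quietly = TRUE)) {
    p <- dittoViz::barPlot(df, var = "clustering", group.by = "groups")
}
```

Setting the levels once on the data keeps every later plot of the same column in the same order, which is usually easier to maintain than repeating index vectors.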

Details

The function creates a data.frame containing counts and percent makeup of var identities for each x-axis grouping (determined by the group.by input). If a subset of data points to use is indicated with the rows.use input, only those rows of the data_frame are used for counts and percent makeup calculations. In other words, the rows.use input adjusts the universe that compositions are calculated within. Then, a vertical bar plot is generated (ggplot2::geom_col()) showing either percent makeup if scale = "percent", which is the default, or raw counts if scale = "count".
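
The calculation described above can be approximated in base R. This is a rough sketch under stated assumptions, not the package's internal code: the toy data and the output column names (`label`, `grouping`, `count`, `percent`) are chosen for illustration and may differ from what `barPlot()` returns with `data.out = TRUE`.

```r
# Hypothetical toy data, made up for illustration.
df <- data.frame(
    groups = c("A", "A", "A", "B", "B", "B", "B"),
    clustering = c("1", "1", "2", "1", "2", "2", "3"),
    stringsAsFactors = FALSE
)

# Subset first: 'rows.use' defines the universe for all later calculations.
rows.use <- df$clustering != "3"
used <- df[rows.use, ]

# Count each 'var' identity within each x-axis grouping.
counts <- table(label = used$clustering, grouping = used$groups)
comp <- as.data.frame(counts, responseName = "count",
    stringsAsFactors = FALSE)

# Percent makeup: each count divided by the total of its grouping.
comp$percent <- comp$count / ave(comp$count, comp$grouping, FUN = sum)
comp
```

Because the subsetting happens before counting, the fractions within each grouping still sum to 1 after a cluster is excluded.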

Value

A ggplot plot where discrete data, grouped by sample, condition, cluster, etc. on the x-axis, is shown on the y-axis as either counts or percent-of-total-per-grouping in a stacked barplot.

Alternatively, if data.out = TRUE, a list containing the plot ("p") and a dataframe of the underlying data ("data").

Alternatively, if do.hover = TRUE, a plotly conversion of the ggplot output in which underlying data can be retrieved upon hovering the cursor over the plot.

Many characteristics of the plot can be adjusted using discrete inputs:

Colors can be adjusted with color.panel and/or colors.
y-axis zoom and tick marks can be adjusted using min, max, and y.breaks.
Titles can be adjusted with main, sub, xlab, ylab, and legend.title arguments.
The legend can be removed by setting legend.show = FALSE.
x-axis labels and groupings can be changed / reordered using x.labels and x.reorder, and rotation of these labels can be turned off with x.labels.rotate = FALSE.
y-axis var-group labels and their order can be changed / reordered using var.labels.rename and var.labels.reorder.
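
Several of these adjustments can be combined in one call, as in the sketch below. Only arguments listed under Usage are used; the toy data, the color choices, and the title strings are assumptions made up for the example, and the call is guarded so the snippet runs even where dittoViz is not installed.

```r
# Hypothetical toy data, made up for illustration.
df <- data.frame(
    groups = c("A", "A", "A", "B", "B", "B"),
    clustering = c("1", "1", "2", "1", "2", "2"),
    stringsAsFactors = FALSE
)

p <- NULL
if (requireNamespace("dittoViz", quietly = TRUE)) {
    p <- dittoViz::barPlot(
        df, var = "clustering", group.by = "groups",
        color.panel = c("steelblue", "orange"), # fills to draw from
        y.breaks = c(0, 0.5, 1),                # tick marks on the y-axis
        main = "Cluster composition per group", # plot title
        ylab = "Fraction of observations",      # y-axis title
        legend.title = "Cluster",               # title above the legend
        x.labels.rotate = FALSE                 # keep x labels horizontal
    )
}
```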

Author(s)

Daniel Bunis

Examples

example("dittoExampleData", echo = FALSE)

# There are two main inputs for this function, in addition to 'data_frame'.
#  var = typically this will be observation-type annotations or clustering
#    This is the set of observations for which we will calculate frequencies
#    (per each unique value of this data) within each group
#  group.by = how to group observations together
barPlot(
    data_frame = example_df,
    var = "clustering",
    group.by = "groups")

# 'scale' then allows choice of scaling by 'percent' (default) or 'count'
barPlot(example_df, "clustering", group.by = "groups",
    scale = "count")

# Particular observations can be ignored from calculations and plotting using
#   the 'rows.use' input.
#   Here, we'll remove an entire "cluster" from consideration, but notice the
#     fractions will still sum to 1.
barPlot(example_df, "clustering", group.by = "groups",
    rows.use = example_df$clustering!="1")

### Accessing underlying data:
# as data.frame, with plot returned too
barPlot(example_df, "clustering", group.by = "groups",
    data.out = TRUE)
# as data.frame, no plot
barPlot(example_df, "clustering", group.by = "groups",
    data.out = TRUE,
    data.only = TRUE)
# through hovering the cursor over the relevant parts of the plot
if (requireNamespace("plotly", quietly = TRUE)) {
    barPlot(example_df, "clustering", group.by = "groups",
        do.hover = TRUE)
    }

[Package dittoViz version 1.0.1 Index]