barPlot {dittoViz}    R Documentation

Outputs a stacked bar plot to show the percent composition of samples, groups, clusters, or other groupings

Description

Outputs a stacked bar plot to show the percent composition of samples, groups, clusters, or other groupings

Usage

barPlot(
  data_frame,
  var,
  group.by,
  scale = c("percent", "count"),
  split.by = NULL,
  rows.use = NULL,
  retain.factor.levels = TRUE,
  data.out = FALSE,
  data.only = FALSE,
  do.hover = FALSE,
  hover.round.digits = 5,
  color.panel = dittoColors(),
  colors = seq_along(color.panel),
  split.nrow = NULL,
  split.ncol = NULL,
  split.adjust = list(),
  y.breaks = NA,
  min = 0,
  max = NA,
  var.labels.rename = NULL,
  var.labels.reorder = NULL,
  x.labels = NULL,
  x.labels.rotate = TRUE,
  x.reorder = NULL,
  theme = theme_classic(),
  xlab = group.by,
  ylab = "make",
  main = "make",
  sub = NULL,
  legend.show = TRUE,
  legend.title = NULL
)

Arguments

data_frame

A data_frame where columns are features and rows are observations you might wish to visualize.

var

Single string representing the name of a column of data_frame to quantify within x-axis groups.

group.by

Single string representing the name of a column of data_frame to use for separating data across discrete x-axis groups.

scale

"count" or "percent". Sets whether data should be shown as raw counts or as percentages. Default = "percent".

split.by

1 or 2 strings denoting the name(s) of column(s) of data_frame containing discrete data to use for faceting / separating data points into separate plots.

When 2 columns are named, c(row,col), the first is used as rows and the second is used for columns of the resulting facet grid.

When 1 column is named, shape control can be achieved with split.nrow and split.ncol.

rows.use

String vector of rownames of data_frame OR an integer vector specifying the row-indices of data points which should be plotted.

Alternatively, a Logical vector, the same length as the number of rows in data_frame, where TRUE values indicate which rows to plot.
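
The three accepted forms can be compared with plain base R subsetting. This is only an illustrative sketch: the toy data frame and its rownames are hypothetical, and it does not show how dittoViz resolves rows.use internally.

```r
# Toy data with explicit rownames (hypothetical example data)
df <- data.frame(
    value = c(10, 20, 30, 40),
    row.names = c("obs1", "obs2", "obs3", "obs4")
)

by_name    <- c("obs2", "obs4")             # string vector of rownames
by_index   <- c(2L, 4L)                     # integer row indices
by_logical <- c(FALSE, TRUE, FALSE, TRUE)   # logical, length equals nrow(df)

# Resolve each form to rownames to confirm they select the same rows
rows_from_name    <- rownames(df[by_name, , drop = FALSE])
rows_from_index   <- rownames(df[by_index, , drop = FALSE])
rows_from_logical <- rownames(df[by_logical, , drop = FALSE])
```

A logical vector built from a condition, such as df$value > 15, is often the most convenient of the three.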

retain.factor.levels

Logical which controls whether factor identities of var and group.by data should be respected. Set to TRUE to faithfully reflect the ordering of groupings encoded in factor levels, but note that this will also force retention of groupings that could otherwise be removed via rows.use.
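
The retention effect follows from how factors behave in base R, which the sketch below illustrates. The vector is made up for illustration, and the mapping onto retain.factor.levels is an assumption drawn from the description above rather than from dittoViz's source.

```r
# Hypothetical grouping data with a deliberate, non-alphabetical level order
groups <- factor(c("A", "B", "C", "A"), levels = c("C", "B", "A"))

# Subsetting a factor (comparable to using rows.use) keeps all of its levels
kept <- groups[groups != "B"]
levels_retained <- levels(kept)             # "B" remains as an empty level

# Dropping unused levels removes the now-empty grouping
levels_dropped <- levels(droplevels(kept))

# The empty level still shows up in tabulations, with a count of 0
counts <- table(kept)
```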

data.out

Logical. When set to TRUE, changes the output, from the plot alone, to a list containing the plot ("p") and a data.frame ("data") containing the underlying data.

data.only

Logical. When set to TRUE, the underlying data will be returned, but not the plot itself.

do.hover

Logical which sets whether the ggplot output should be converted to a ggplotly object with data about individual bars displayed when you hover your cursor over them.

hover.round.digits

Integer number specifying the number of decimal digits to round displayed numeric values to, when do.hover is set to TRUE.

color.panel

String vector which sets the colors to draw from for data representation fills. Default = dittoColors().

A named vector can be used if names are matched to the distinct values of the var data.

colors

Integer vector, the indexes / order, of colors from color.panel to actually use.

Useful for quickly swapping around colors of the default set (when not using names for color matching).

split.nrow, split.ncol

Integers which set the dimensions of faceting/splitting when faceting by a single feature.

split.adjust

A named list which allows extra parameters to be pushed through to the faceting function call. List elements should be valid inputs to the faceting functions, e.g. 'list(scales = "free")'.

For options, when giving 1 column to split.by, see facet_wrap, OR when giving 2 columns to split.by, see facet_grid.

y.breaks

Numeric vector which sets the plot's tick marks / major gridlines. c(break1,break2,break3,etc.)

min, max

Scalars which control the zoom of the plot. These inputs set the minimum / maximum values of the y-axis. Default = set based on the limits of the data: 0 to 1 for scale = "percent", or 0 to the maximum count for scale = "count".

var.labels.rename

String vector for renaming the distinct identities of var-values. This vector must be the same length as the number of levels or unique values in the var-data.

Hint: use colLevels or unique(data_frame[,var]) to view the original values.
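
As a rough base R sketch of the length requirement, the example below renames values through factor labels. The data and the new names are hypothetical, and it assumes new labels are matched to the original values in sorted level order, which is not confirmed against dittoViz's implementation.

```r
# Hypothetical var data
var_data <- c("1", "2", "1", "3")

# Inspect the original values first; their order determines the matching
original <- levels(factor(var_data))

# One new label per original value, given in the same order (made-up names)
new_labels <- c("T cells", "B cells", "Monocytes")
stopifnot(length(new_labels) == length(original))

# Rename by supplying the new labels for the existing levels
renamed <- factor(var_data, labels = new_labels)
```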

var.labels.reorder

Integer vector. A sequence of numbers, from 1 to the number of distinct var-value identities, for rearranging the order of the var-value groupings within the plot space.

Method: Make a first plot without this input. Then, treat the top-most grouping as index 1 and the bottom-most as index n. Values of var.labels.reorder should be these indices, but in the order that you would like them rearranged to be.

x.labels

String vector which will replace the x-axis groupings' labels. Regardless of x.reorder, the first component of x.labels sets the name for the left-most x-axis grouping.

x.labels.rotate

Logical which sets whether the x-axis grouping labels should be rotated.

x.reorder

Integer vector. A sequence of numbers, from 1 to the number of groupings, for rearranging the order of x-axis groupings.

Method: Make a first plot without this input. Then, treat the leftmost grouping as index 1 and the rightmost as index n. Values of x.reorder should be these indices, but in the order that you would like them rearranged to be.

Recommendation for advanced users: If you find yourself coming back to this input too many times, an alternative solution that can be easier long-term is to make the target data into a factor, and to put its levels in the desired order: factor(data, levels = c("level1", "level2", ...)).
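
The two approaches can be compared in base R. This is a minimal sketch with made-up data; it assumes that index-based reordering amounts to picking the default levels in the given index order, which matches the method described above but is not taken from dittoViz's source.

```r
# Hypothetical x-axis grouping data
groups <- c("B", "A", "C", "A")

# Default ordering is alphabetical, so indices 1, 2, 3 correspond to A, B, C
default_levels <- levels(factor(groups))

# Index-based reordering (assumed to mirror x.reorder = c(3, 1, 2))
by_index <- factor(groups, levels = default_levels[c(3, 1, 2)])

# Factor-based alternative: state the desired order once, by name
by_name <- factor(groups, levels = c("C", "A", "B"))
```

Naming the levels directly stays correct if a grouping is later added or removed, whereas index vectors have to be recounted.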

theme

A ggplot theme which will be applied before dittoViz adjustments. Default = theme_classic(). See https://ggplot2.tidyverse.org/reference/ggtheme.html for other options and ideas.

xlab

String which sets the x-axis title. Default is group.by so it defaults to the name of the grouping information. Set to NULL to remove.

ylab

String which sets the y-axis title. Default = "make" and, if left as "make", a title will be automatically generated.

main

String which sets the plot title.

sub

String which sets the plot subtitle.

legend.show

Logical. Whether the legend should be displayed. Default = TRUE.

legend.title

String which adds a title to the legend.

Details

The function creates a data.frame containing counts and percent makeup of var identities for each x-axis grouping (determined by the group.by input). If a subset of data points to use is indicated with the rows.use input, only those rows of the data_frame are used for counts and percent makeup calculations. In other words, the rows.use input adjusts the universe that compositions are calculated within. Then, a vertical bar plot is generated (ggplot2::geom_col()) showing either percent makeup if scale = "percent", which is the default, or raw counts if scale = "count".
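
The calculation described above can be approximated in base R. The following is a minimal sketch, not dittoViz's internal code; the toy data frame df and its contents are made up for illustration.

```r
# Toy data (hypothetical): 6 observations, 2 groups, 2 clusters
df <- data.frame(
    groups = c("A", "A", "A", "B", "B", "B"),
    clustering = c("1", "1", "2", "1", "2", "2"),
    stringsAsFactors = FALSE
)

# Counts of each var value (rows) within each group.by value (columns)
counts <- table(df$clustering, df$groups)

# Fraction of each column's total, so every x-axis group sums to 1
fractions <- prop.table(counts, margin = 2)

# Restricting rows first (analogous to rows.use) changes the universe:
# with cluster "1" removed, the remaining cluster makes up all of each group
keep <- df$clustering != "1"
fractions_subset <- prop.table(
    table(df$clustering[keep], df$groups[keep]),
    margin = 2
)
```

Because the subset is applied before tabulating, the fractions within each group still sum to 1 after rows are excluded.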

Value

A ggplot plot where discrete data, grouped by sample, condition, cluster, etc. on the x-axis, is shown on the y-axis as either counts or percent-of-total-per-grouping in a stacked barplot.

Alternatively, if data.out = TRUE, a list containing the plot ("p") and a dataframe of the underlying data ("data").

Alternatively, if do.hover = TRUE, a plotly conversion of the ggplot output in which underlying data can be retrieved upon hovering the cursor over the plot.

Many characteristics of the plot can be adjusted using discrete inputs.

Author(s)

Daniel Bunis

Examples

example("dittoExampleData", echo = FALSE)

# There are two main inputs for this function, in addition to 'data_frame'.
#  var = typically this will be observation-type annotations or clustering
#    This is the set of observations for which we will calculate frequencies
#    (per each unique value of this data) within each group
#  group.by = how to group observations together
barPlot(
    data_frame = example_df,
    var = "clustering",
    group.by = "groups")

# 'scale' then allows choice of scaling by 'percent' (default) or 'count'
barPlot(example_df, "clustering", group.by = "groups",
    scale = "count")

# Particular observations can be ignored from calculations and plotting using
#   the 'rows.use' input.
#   Here, we'll remove an entire "cluster" from consideration, but notice the
#     fractions will still sum to 1.
barPlot(example_df, "clustering", group.by = "groups",
    rows.use = example_df$clustering!="1")

### Accessing underlying data:
# as data.frame, with plot returned too
barPlot(example_df, "clustering", group.by = "groups",
    data.out = TRUE)
# as data.frame, no plot
barPlot(example_df, "clustering", group.by = "groups",
    data.out = TRUE,
    data.only = TRUE)
# through hovering the cursor over the relevant parts of the plot
if (requireNamespace("plotly", quietly = TRUE)) {
    barPlot(example_df, "clustering", group.by = "groups",
        do.hover = TRUE)
    }


[Package dittoViz version 1.0.1 Index]