q_summarise {timeplyr}  R Documentation

Fast grouped quantile summary

Description

Calculates quantiles of one or more variables, optionally by group. The collapse and data.table packages are used for the calculations.

Usage

q_summarise(
  data,
  ...,
  probs = seq(0, 1, 0.25),
  type = 7,
  pivot = c("wide", "long"),
  na.rm = TRUE,
  sort = TRUE,
  .by = NULL,
  .cols = NULL
)

Arguments

data

A data frame.

...

Variables to calculate quantiles for. Tidy data-masking applies.

probs

Numeric vector of quantile probabilities, each between 0 and 1. Default is seq(0, 1, 0.25).

type

An integer from 5 to 9 specifying which quantile algorithm to use. Default is 7. See quantile for more details.
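To illustrate how the choice of type can change the result, here is a minimal sketch using base R's quantile() rather than q_summarise itself. The vector x is made up for illustration, and it is an assumption that q_summarise follows the same type definitions as quantile.

```r
# Illustrative data (hypothetical, not from the package)
x <- c(1, 2, 4, 8, 16)

# Compute the 25th percentile under each of the types 5 to 9
types <- 5:9
q_by_type <- sapply(types, function(t) {
  quantile(x, probs = 0.25, type = t, names = FALSE)
})
names(q_by_type) <- paste0("type", types)

# Small samples make the differences between types visible
q_by_type
```

The differences between types shrink as the sample size grows, so the choice matters most for small groups.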

pivot

Should the result be pivoted "wide" or "long"? Default is "wide".
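The two layouts can be sketched in base R as follows. This is not how timeplyr computes its result, and the column names used here (p25, prob, value) are illustrative assumptions rather than the names q_summarise returns.

```r
# Base R sketch of the wide and long layouts
probs <- c(0.25, 0.5, 0.75)
by_species <- split(iris$Sepal.Length, iris$Species)

# Wide: one row per group, one column per probability
wide <- data.frame(
  Species = names(by_species),
  t(sapply(by_species, quantile, probs = probs, names = FALSE))
)
rownames(wide) <- NULL
names(wide)[-1] <- paste0("p", probs * 100)

# Long: one row per group and probability pair
long <- data.frame(
  Species = rep(wide$Species, each = length(probs)),
  prob = rep(probs, times = nrow(wide)),
  value = as.vector(t(as.matrix(wide[-1])))
)

wide
long
```

The long layout maps directly onto plotting aesthetics, since each quantile value sits in its own row alongside its group and probability.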

na.rm

Should NA values be removed? Default is TRUE.

sort

Should groups be sorted? Default is TRUE.

.by

(Optional) A selection of columns to group by for this operation. Columns are specified using tidy-select.

.cols

(Optional) Alternative to ... that accepts a named character vector or numeric vector. Recommended when speed is a priority.
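A possible use of .cols with a character vector of column names is sketched below. It assumes timeplyr is installed and that .cols can be combined with .by in this way; the call is skipped if the package is unavailable.

```r
# Assumes timeplyr is installed; otherwise nothing is run
if (requireNamespace("timeplyr", quietly = TRUE)) {
  # Columns given as a character vector instead of through ...
  cols <- c("Sepal.Length", "Sepal.Width")
  res <- timeplyr::q_summarise(iris, .cols = cols,
                               probs = c(0, 1),
                               .by = Species)
  print(res)
}
```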

Value

A data.table containing the quantile values for each group.

See Also

stat_summarise

Examples

library(timeplyr)
library(dplyr)

# Standard quantiles
iris %>%
  q_summarise(Sepal.Length)
# Quantiles by species
iris %>%
  q_summarise(Sepal.Length, .by = Species)
# Quantiles by species across multiple columns
iris %>%
  q_summarise(Sepal.Length, Sepal.Width,
            probs = c(0, 1),
            .by = Species)
# Long format if one desires, useful for ggplot2
iris %>%
  q_summarise(Sepal.Length, pivot = "long",
            .by = Species)
# Example with lots of groups
set.seed(20230606)
df <- data.frame(x = rnorm(10^5),
                 g = sample.int(10^5, replace = TRUE))
q_summarise(df, x, .by = g, sort = FALSE)


[Package timeplyr version 0.5.0 Index]