summarise_quantile {CDMConnector}    R Documentation

Quantile calculation using dbplyr

Description

This function provides DBMS-independent syntax for quantile estimation. It can be used by itself or in combination with mutate() when calculating other aggregate metrics (min, max, mean).

summarise_quantile(), summarize_quantile(), summariseQuantile() and summarizeQuantile() are synonyms.

Usage

summarise_quantile(.data, x = NULL, probs, name_suffix = "value")

summarize_quantile(.data, x = NULL, probs, name_suffix = "value")

summariseQuantile(.data, x = NULL, probs, nameSuffix = "value")

summarizeQuantile(.data, x = NULL, probs, nameSuffix = "value")

Arguments

.data

lazy data frame backed by a database query.

x

column name whose sample quantiles are wanted.

probs

numeric vector of probabilities with values in [0,1].

name_suffix, nameSuffix

character; suffix appended to the numeric quantile value to form part of each output column name.

Details

The implemented quantile estimation algorithm returns values analogous to quantile{stats} with argument type = 1; see the discussion in Hyndman and Fan (1996). Results differ from PERCENTILE_CONT as natively implemented in various DBMS, whose returned values are equal to those of quantile{stats} with the default argument type = 7.
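
The difference between the two quantile types can be seen locally with base R alone. The following is a minimal sketch using stats::quantile() on a small illustrative vector (the vector x is an assumption chosen for the example, not data from the package):

```r
# Small illustrative vector (hypothetical data, even length so the median
# falls between two observations)
x <- c(1, 2, 3, 4)

# type = 1: inverse of the empirical CDF, always returns an observed value
q1 <- stats::quantile(x, probs = 0.5, type = 1, names = FALSE)

# type = 7: R's default, linear interpolation between order statistics
q7 <- stats::quantile(x, probs = 0.5, type = 7, names = FALSE)

q1  # 2
q7  # 2.5
```

With type = 1 the result is always one of the observed values, whereas type = 7 may return a value that does not occur in the data.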

Value

An object of the same type as '.data'.

Examples

## Not run: 
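# Connect to an in-memory DuckDB database and copy mtcars into a temporary table
con <- DBI::dbConnect(duckdb::duckdb())
mtcars_tbl <- dplyr::copy_to(con, mtcars, name = "tmp", overwrite = TRUE, temporary = TRUE)

# Add the per-cylinder mean with mutate(), then compute mpg quantiles per group
df <- mtcars_tbl %>%
  dplyr::group_by(cyl) %>%
  dplyr::mutate(mean = mean(mpg, na.rm = TRUE)) %>%
  summarise_quantile(mpg, probs = c(0, 0.2, 0.4, 0.6, 0.8, 1),
                     name_suffix = "quant") %>%
  dplyr::collect()

# Close the connection and shut down the DuckDB instance
DBI::dbDisconnect(con, shutdown = TRUE)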

## End(Not run)
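
For a local point of comparison with the database example above, the same per-group quantiles can be computed in base R with type = 1, which the Details section describes as the analogous method. This is only a sketch of a reference computation; the object name reference is an assumption for illustration, and the layout of the result differs from what summarise_quantile() returns.

```r
probs <- c(0, 0.2, 0.4, 0.6, 0.8, 1)

# Split mpg by cylinder count and compute type 1 quantiles for each group.
# The result is a matrix: one row per probability, one column per cyl value.
reference <- sapply(
  split(mtcars$mpg, mtcars$cyl),
  stats::quantile,
  probs = probs,
  type = 1
)

reference
```

Because type = 1 never interpolates, every entry in reference is an mpg value that actually occurs in mtcars.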

[Package CDMConnector version 1.5.0 Index]