sdf_pivot {sparklyr} | R Documentation
Pivot a Spark DataFrame
Description
Construct a pivot table over a Spark DataFrame, using a syntax similar to
that of reshape2::dcast.
Usage
sdf_pivot(x, formula, fun.aggregate = "count")
Arguments
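x | A Spark DataFrame, for example a tbl_spark such as the one returned by sdf_copy_to().
formula | A two-sided R formula of the form x_1 + x_2 + ... ~ y_1. The left-hand side names the columns used for grouping, and the right-hand side names the column used for pivoting.
fun.aggregate | How the grouped dataset should be aggregated. Can be a length-one character vector giving the name of a Spark aggregation function to call; a named R list mapping column names to aggregation methods; or an R function that is invoked on the grouped dataset.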
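The three forms of fun.aggregate described above can be sketched as follows. This is a minimal illustration rather than a definitive recipe: it assumes a local Spark installation, uses the built-in mtcars data copied to Spark under the hypothetical name mtcars_tbl, and, for the function form, assumes the grouped dataset is passed as a Spark object reference on which invoke() can call a method such as count.

```r
library(sparklyr)

# Assumption: a local Spark installation is available.
sc <- spark_connect(master = "local")

# Hypothetical table name, chosen for this illustration only.
mtcars_tbl <- sdf_copy_to(sc, mtcars, name = "mtcars_tbl", overwrite = TRUE)

# 1. Length-one character vector: the name of a Spark aggregation method.
#    Counts the rows for each (cyl, gear) combination.
by_name <- sdf_pivot(mtcars_tbl, cyl ~ gear, fun.aggregate = "count")

# 2. Named list: maps a column name to an aggregation method.
#    Averages mpg within each (cyl, gear) combination.
by_list <- sdf_pivot(mtcars_tbl, cyl ~ gear,
  fun.aggregate = list(mpg = "mean")
)

# 3. R function: receives the grouped dataset and returns the aggregate.
#    Assumption: the argument is a Spark object reference, so invoke()
#    can call its count method.
by_fun <- sdf_pivot(mtcars_tbl, cyl ~ gear,
  fun.aggregate = function(gdf) invoke(gdf, "count")
)

spark_disconnect(sc)
```

Forms 1 and 3 are intended to produce the same counts; the function form is mainly useful when the aggregation cannot be expressed as a single method name or a column-to-method mapping.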
Examples
## Not run:
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")
iris_tbl <- sdf_copy_to(sc, iris, name = "iris_tbl", overwrite = TRUE)

# aggregating by mean
iris_tbl %>%
  mutate(Petal_Width = ifelse(Petal_Width > 1.5, "High", "Low")) %>%
  sdf_pivot(Petal_Width ~ Species,
    fun.aggregate = list(Petal_Length = "mean")
  )

# aggregating all observations in a list
iris_tbl %>%
  mutate(Petal_Width = ifelse(Petal_Width > 1.5, "High", "Low")) %>%
  sdf_pivot(Petal_Width ~ Species,
    fun.aggregate = list(Petal_Length = "collect_list")
  )
## End(Not run)
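As a rough local cross-check of the "aggregating by mean" example, the same grouping can be approximated in base R without Spark. This is only an analogy, not how sdf_pivot is implemented: it uses tapply() on the local iris data frame, whose column names contain dots (Petal.Width) rather than the underscores used in the Spark copy, and the names local_iris and local_pivot are hypothetical.

```r
# Recode Petal.Width into two groups, as in the Spark example.
local_iris <- transform(
  iris,
  Petal.Width = ifelse(Petal.Width > 1.5, "High", "Low")
)

# Rows: Petal.Width group; columns: Species; cells: mean Petal.Length.
# Combinations with no observations come back as NA.
local_pivot <- with(
  local_iris,
  tapply(Petal.Length, list(Petal.Width, Species), mean)
)

print(local_pivot)
```

The result has one row per Petal.Width group and one column per species, which mirrors the layout described by the formula Petal_Width ~ Species.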
[Package sparklyr version 1.8.6 Index]