sdf_select {sparklyr.nested} | R Documentation
Select nested items
Description
The select function works well for keeping or dropping top-level fields. It does not,
however, support access to nested data. This function accepts complex field names
such as x.y.z, where z is a field nested within y, which is in turn nested within x.
Since R uses "$" to access nested elements and Java/Scala use ".",
sdf_select(data, x.y.z) and sdf_select(data, x$y$z) are equivalent.
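The equivalence of the two notations can be pictured as a normalization of the field expression into a single dot-separated path. The following is a minimal base-R sketch of that idea, not the package's actual implementation; the helper name to_dot_path is hypothetical.

```r
# Hypothetical helper (not part of sparklyr.nested): turn an unevaluated
# field expression into a dot-separated path string.
to_dot_path <- function(expr) {
  # deparse() renders the expression as text, e.g. "x$y$z" or "x.y.z"
  txt <- paste(deparse(expr), collapse = "")
  # replace every literal "$" with "."
  gsub("$", ".", txt, fixed = TRUE)
}

dollar_path <- to_dot_path(quote(x$y$z))
dot_path <- to_dot_path(quote(x.y.z))

# both notations normalize to the same path
identical(dollar_path, dot_path)
```

Both calls yield the string "x.y.z", which is why either spelling can be used to name the same nested field.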
Usage
sdf_select(x, ..., .aliases, .drop_parents = TRUE, .full_name = FALSE)
Arguments
x | An object (usually a
... | Fields to select
.aliases | Character. Optional. If provided these names will be matched positionally with selected fields provided in
.drop_parents | Logical. If
.full_name | Logical. If
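Positional matching of .aliases means that the first alias applies to the first selected field, the second alias to the second field, and so on. The base-R sketch below only illustrates that pairing; the field and alias values are assumptions chosen for illustration, and the code does not call sdf_select or reproduce its internals.

```r
# Illustrative values only; these names are assumptions, not package output.
fields  <- c("Species", "Sepal.Sepal_Width")
aliases <- c("species", "sepal_width")

# Positional matching requires exactly one alias per selected field.
stopifnot(length(fields) == length(aliases))

# Named vector: each alias (name) points at the field in the same position.
alias_map <- setNames(fields, aliases)
alias_map[["sepal_width"]]
```

Because the pairing depends purely on order, reordering the selected fields without reordering the aliases would attach names to the wrong fields.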
Selection Helpers
dplyr allows the use of selection helpers (e.g., see everything).
These helpers only work for top-level fields, however. For now, all nested fields that should
be promoted need to be explicitly identified.
Examples
## Not run:
library(sparklyr)
library(sparklyr.nested)
library(dplyr)

# assumes a local Spark installation is available
sc <- spark_connect(master = "local")

# copy iris to Spark (dots in column names become underscores), then
# nest the sepal measurements under a single field named "Sepal"
iris_tbl <- copy_to(sc, iris, name = "iris")
iris_nst <- iris_tbl %>%
  sdf_nest(Sepal_Length, Sepal_Width, .key = "Sepal")

# using java-like dot-notation
iris_nst %>%
  sdf_select(Species, Petal_Width, Sepal.Sepal_Width)

# using R-like dollar-sign-notation
iris_nst %>%
  sdf_select(Species, Petal_Width, Sepal$Sepal_Width)

# using dplyr selection helpers for top-level fields,
# with the nested field named explicitly
iris_nst %>%
  sdf_select(Species, matches("Petal"), Sepal$Sepal_Width)
## End(Not run)