read_sf_dataset {sfarrow} | R Documentation
Read an Arrow multi-file dataset and create sf object
Description
Read an Arrow multi-file dataset and create an sf object.
Usage
read_sf_dataset(dataset, find_geom = FALSE)
Arguments
dataset | an Arrow Dataset opened with arrow::open_dataset, or a dplyr query built on one.
find_geom | logical. Only needed when returning a subset of columns. Should all available geometry columns be selected and added to the dataset query without being named? Default is FALSE.
Details
This function is primarily for use after opening a dataset with arrow::open_dataset. Users can then query the arrow Dataset using dplyr methods such as filter or select. Passing the resulting query to this function will parse the datasets and create an sf object. The function expects consistent geographic metadata to be stored with the dataset in order to create sf objects.
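The find_geom argument matters when a query selects only some columns. The sketch below is a minimal illustration, assuming the sf, arrow, dplyr, and sfarrow packages are installed; the object names (q, nc_sel) and the deterministic grouping are illustrative choices, not part of the package documentation.

```r
library(sfarrow)

# Write a small partitioned dataset to a temporary directory
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc$group <- rep_len(1:3, nrow(nc))  # deterministic grouping, for illustration only
tf <- tempfile()
write_sf_dataset(dplyr::group_by(nc, group), path = tf)

ds <- arrow::open_dataset(tf)

# Select attribute columns only; the geometry column is not named here
q <- dplyr::select(ds, NAME, group)

# find_geom = TRUE asks read_sf_dataset() to add the geometry column(s)
# recorded in the dataset's metadata to the selection
nc_sel <- read_sf_dataset(q, find_geom = TRUE)
class(nc_sel)

# tf is a directory, so remove it recursively
unlink(tf, recursive = TRUE)
```

With the default find_geom = FALSE, a query that selects a subset of columns would need to name the geometry column explicitly.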
Value
object of class sf
See Also
open_dataset, st_read, st_read_parquet
Examples
# read spatial object
nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE)
# create random grouping
nc$group <- sample(1:3, nrow(nc), replace = TRUE)
# use dplyr to group the dataset. %>% also allowed
nc_g <- dplyr::group_by(nc, group)
# write out to parquet datasets
tf <- tempfile() # create temporary location
on.exit(unlink(tf, recursive = TRUE)) # tf is a directory, so remove it recursively
# partitioning determined by dplyr 'group_vars'
write_sf_dataset(nc_g, path = tf)
list.files(tf, recursive = TRUE)
# open parquet files from dataset
ds <- arrow::open_dataset(tf)
# create a query. %>% also allowed
q <- dplyr::filter(ds, group == 1)
# read the dataset (piping syntax also works)
nc_d <- read_sf_dataset(dataset = q)
nc_d
plot(sf::st_geometry(nc_d))