write_sf_dataset {sfarrow}    R Documentation

Write sf object to an Arrow multi-file dataset

Description

Write an sf object to an Arrow multi-file dataset, optionally partitioned by one or more columns.

Usage

write_sf_dataset(
  obj,
  path,
  format = "parquet",
  partitioning = dplyr::group_vars(obj),
  ...
)

Arguments

obj

object of class sf

path

string path referencing a directory for the output

format

output file format ("parquet" or "feather")

partitioning

character vector of column names in obj to partition by; defaults to the grouping variables of obj, as returned by dplyr::group_vars

...

additional arguments and options passed to arrow::write_dataset

Details

Converts an sf spatial object to a data.frame whose geometry columns are encoded as WKB (well-known binary), then writes it to an Arrow dataset with partitioning. Datasets grouped with dplyr (using group_by) are supported; their grouping variables define the partitions.
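The conversion step described above can be illustrated with sf alone. This is a minimal sketch, not sfarrow's internal code: it assumes that encoding each geometry with sf::st_as_binary and storing the result in a list column approximates the "data.frame with WKB geometry columns" stage, and it ignores any metadata (such as the CRS) that would need to be kept alongside the data.

```r
library(sf)

# Read the example shapefile that ships with sf
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Keep only the attribute columns as a plain data.frame
df <- sf::st_drop_geometry(nc)

# Encode every feature's geometry as a raw WKB vector
wkb <- sf::st_as_binary(sf::st_geometry(nc))

# Store the WKB blobs in an ordinary list column (illustrative assumption,
# not necessarily how sfarrow lays out the column internally)
df$geometry <- unclass(wkb)

class(df)
class(df$geometry[[1]])
```

A table in this form contains only plain column types, which is what allows it to be handed to a generic writer such as arrow::write_dataset.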

Value

obj, invisibly

See Also

write_dataset, st_read_parquet

Examples

# read spatial object
nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE)

# create random grouping
nc$group <- sample(1:3, nrow(nc), replace = TRUE)

# use dplyr to group the dataset. %>% also allowed
nc_g <- dplyr::group_by(nc, group)

# write out to parquet datasets
tf <- tempfile()  # create temporary location
on.exit(unlink(tf, recursive = TRUE))  # tf is a directory, so remove it recursively
# partitioning determined by dplyr 'group_vars'
write_sf_dataset(nc_g, path = tf)

list.files(tf, recursive = TRUE)

# open parquet files from dataset
ds <- arrow::open_dataset(tf)

# create a query. %>% also allowed
q <- dplyr::filter(ds, group == 1)

# read the dataset (piping syntax also works)
nc_d <- read_sf_dataset(dataset = q)

nc_d
plot(sf::st_geometry(nc_d))
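The partitioning argument can also be supplied directly as a character vector, without grouping the object first. The following is a hedged sketch of that usage; the variable names out_dir, files and nc_back are illustrative, and the grouping column is built deterministically with rep_len rather than sample so that every partition is guaranteed to be non-empty.

```r
library(sfarrow)

# Read the example shapefile and add a deterministic grouping column
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc$group <- rep_len(1:3, nrow(nc))

# Write an ungrouped sf object, naming the partition column explicitly
out_dir <- tempfile()
write_sf_dataset(nc, path = out_dir, partitioning = "group")

# One sub-directory per value of 'group' is expected
files <- list.files(out_dir, recursive = TRUE)
files

# Read the whole dataset back as an sf object
nc_back <- read_sf_dataset(arrow::open_dataset(out_dir))

# Clean up the temporary directory
unlink(out_dir, recursive = TRUE)
```

Passing the column name explicitly avoids a dplyr dependency in calling code when the partition columns are already known.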


[Package sfarrow version 0.4.1 Index]