R: Read geospatial data into a Spark DataFrame.

spark_read_shapefile {apache.sedona}

R Documentation

Read geospatial data into a Spark DataFrame.

Description

Functions to read geospatial data from a variety of formats into Spark DataFrames.

spark_read_shapefile: from a shapefile
spark_read_geojson: from a GeoJSON file
spark_read_geoparquet: from a GeoParquet file

Usage

spark_read_shapefile(sc, name = NULL, path = name, options = list(), ...)

spark_read_geojson(
  sc,
  name = NULL,
  path = name,
  options = list(),
  repartition = 0,
  memory = TRUE,
  overwrite = TRUE
)

spark_read_geoparquet(
  sc,
  name = NULL,
  path = name,
  options = list(),
  repartition = 0,
  memory = TRUE,
  overwrite = TRUE
)

Arguments

`sc`	A `spark_connection`.
`name`	The name to assign to the newly generated table.
`path`	The path to the file. Needs to be accessible from the cluster. Supports the `"hdfs://"`, `"s3a://"` and `"file://"` protocols.
`options`	A list of strings with additional options. See https://spark.apache.org/docs/latest/sql-programming-guide.html.
`...`	Optional arguments; currently unused.
`repartition`	The number of partitions used to distribute the generated table. Use 0 (the default) to skip repartitioning.
`memory`	Boolean; should the data be loaded eagerly into memory? (That is, should the table be cached?)
`overwrite`	Boolean; overwrite the table with the given name if it already exists?

Value

A tbl that references the Spark DataFrame holding the data that was read.

Examples

library(sparklyr)
library(apache.sedona)

sc <- spark_connect(master = "spark://HOST:PORT")

if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace it with the path to your input file
  # The file location is passed via `path`; the result is a tbl, not an RDD.
  sdf <- spark_read_shapefile(sc, path = input_location)
}
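
The GeoJSON and GeoParquet readers listed under Usage take `repartition`, `memory`, and `overwrite` in addition to `sc`, `name`, and `path`. The following is a minimal sketch of how those arguments might be combined. The helper `read_geo_tables`, the table names, and the file paths are hypothetical and only serve as illustration; the argument values are arbitrary choices, not recommendations.

```r
# Hypothetical helper (not part of apache.sedona): reads one GeoJSON file and
# one GeoParquet file into Spark, using the signatures listed under Usage.
read_geo_tables <- function(sc, geojson_path, geoparquet_path) {
  # Register the GeoJSON data under an illustrative table name without
  # caching it eagerly (memory = FALSE).
  geojson_sdf <- apache.sedona::spark_read_geojson(
    sc,
    name = "geojson_tbl",   # illustrative table name
    path = geojson_path,
    repartition = 0,        # the default
    memory = FALSE,
    overwrite = TRUE
  )

  # Read the GeoParquet data, spread it over 4 partitions, and cache it.
  geoparquet_sdf <- apache.sedona::spark_read_geoparquet(
    sc,
    name = "geoparquet_tbl", # illustrative table name
    path = geoparquet_path,
    repartition = 4,
    memory = TRUE,
    overwrite = TRUE
  )

  list(geojson = geojson_sdf, geoparquet = geoparquet_sdf)
}

# Usage (needs a live Spark connection with Sedona available; the paths
# below are placeholders to be replaced with real input files):
# sc <- sparklyr::spark_connect(master = "spark://HOST:PORT")
# tables <- read_geo_tables(
#   sc,
#   geojson_path = "file:///path/to/input.geojson",
#   geoparquet_path = "file:///path/to/input.parquet"
# )
```

Defining the helper does not contact Spark; the readers run only when it is called with an open connection.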

[Package apache.sedona version 1.6.0 Index]