sedona_read_dsv_to_typed_rdd {apache.sedona}    R Documentation

Create a typed SpatialRDD from a delimiter-separated values data source.

Description

Create a typed SpatialRDD (namely, a PointRDD, a PolygonRDD, or a LineStringRDD) from a data source containing delimiter-separated values. The data source can contain spatial attributes (e.g., longitude and latitude) and other attributes. Currently, only inputs whose spatial attributes occupy a contiguous range of columns (i.e., [first_spatial_col_index, last_spatial_col_index]) are supported.

Usage

sedona_read_dsv_to_typed_rdd(
  sc,
  location,
  delimiter = c(",", "\t", "?", "'", "\"", "_", "-", "%", "~", "|", ";"),
  type = c("point", "polygon", "linestring"),
  first_spatial_col_index = 0L,
  last_spatial_col_index = NULL,
  has_non_spatial_attrs = TRUE,
  storage_level = "MEMORY_ONLY",
  repartition = 1L
)

Arguments

sc

A spark_connection.

location

Location of the data source.

delimiter

Delimiter within each record. Must be one of ",", "\t", "?", "'", "\"", "_", "-", "%", "~", "|", ";".

type

Type of the SpatialRDD (must be one of "point", "polygon", or "linestring").

first_spatial_col_index

Zero-based index of the left-most column containing spatial attributes (default: 0).

last_spatial_col_index

Zero-based index of the right-most column containing spatial attributes (default: NULL). last_spatial_col_index does not need to be specified when creating a PointRDD, because it is implied to be (first_spatial_col_index + 1). For all other types of RDDs, an unspecified last_spatial_col_index is treated as -1 (i.e., the last of all input columns).

has_non_spatial_attrs

Whether the input contains non-spatial attributes.

storage_level

Storage level of the RDD (default: MEMORY_ONLY).

repartition

The minimum number of partitions to have in the resulting RDD (default: 1).
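
The defaulting rule described for last_spatial_col_index can be sketched in plain R. This is only an illustration of the documented behaviour, not the package's actual implementation; the helper name resolve_last_spatial_col_index is hypothetical and is not part of apache.sedona.

```r
# Hypothetical helper (not part of apache.sedona): mirrors the documented
# defaulting rule for last_spatial_col_index.
resolve_last_spatial_col_index <- function(type = c("point", "polygon", "linestring"),
                                           first_spatial_col_index = 0L,
                                           last_spatial_col_index = NULL) {
  type <- match.arg(type)
  # An explicitly supplied value is used as-is.
  if (!is.null(last_spatial_col_index)) {
    return(as.integer(last_spatial_col_index))
  }
  if (type == "point") {
    # A point occupies two adjacent columns.
    as.integer(first_spatial_col_index) + 1L
  } else {
    # -1 stands for "the last of all input columns".
    -1L
  }
}

point_last <- resolve_last_spatial_col_index("point", first_spatial_col_index = 1L)
polygon_last <- resolve_last_spatial_col_index("polygon")
```

Under this sketch, point_last is 2L and polygon_last is -1L, matching the rule stated for last_spatial_col_index.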

Value

A typed SpatialRDD.

See Also

Other Sedona RDD data interface functions: sedona_read_geojson(), sedona_read_shapefile_to_typed_rdd(), sedona_save_spatial_rdd(), sedona_write_wkb()

Examples

library(sparklyr)
library(apache.sedona)

sc <- spark_connect(master = "spark://HOST:PORT")

if (!inherits(sc, "test_connection")) {
  input_location <- "/dev/null" # replace with the path to your CSV file
  rdd <- sedona_read_dsv_to_typed_rdd(
    sc,
    location = input_location,
    delimiter = ",",
    type = "point",
    first_spatial_col_index = 1L
  )
}
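
To make the zero-based column indices concrete, the following sketch writes a small comma-separated file in which one non-spatial column precedes two coordinate columns. It uses base R only and needs no Spark connection. The file contents, the coordinate values, and the column names id, lon, and lat are all hypothetical and chosen purely for illustration.

```r
# Hypothetical sample data: an id column followed by two coordinate columns.
input_location <- tempfile(fileext = ".csv")
writeLines(
  c(
    "a,-88.331492,32.324142",
    "b,-88.175933,32.360763"
  ),
  input_location
)

# Zero-based column positions: 0 = id, 1 = lon, 2 = lat.
# The spatial attributes therefore occupy the contiguous range [1, 2].
records <- read.csv(
  input_location,
  header = FALSE,
  col.names = c("id", "lon", "lat")
)
spatial_cols <- records[, c("lon", "lat")]
print(spatial_cols)
```

A file laid out this way corresponds to first_spatial_col_index = 1L with type = "point", as in the call above, where last_spatial_col_index is implied to be 2.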


[Package apache.sedona version 1.5.1 Index]