spark_read_tfrecord {sparktf} | R Documentation |
Read a TFRecord File
Description
Read a TFRecord file as a Spark DataFrame.
Usage
spark_read_tfrecord(sc, name = NULL, path = name, schema = NULL,
record_type = c("Example", "SequenceExample"), overwrite = TRUE)
Arguments
sc
A spark connection.

name
The name to assign to the newly generated table, or the path to the file. Note that if a path is provided for the 'name' argument, a separate table name cannot be specified (see the sketch following this list).

path
The path to the file. Needs to be accessible from the cluster. Supports the "hdfs://", "s3a://" and "file://" protocols.

schema
(Currently unsupported.) Schema of the TensorFlow records. If not provided, the schema is inferred from the TensorFlow records.

record_type
Input format of the TensorFlow records, either "Example" or "SequenceExample". Defaults to "Example".

overwrite
Boolean; overwrite the table with the given name if it already exists?
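Because 'path' defaults to 'name', a single value can be passed and used as both. The following is a minimal sketch, not part of the package examples; the local directory and the table name "iris_tf" are hypothetical:

# A path alone: it is also used as the 'name' argument.
spark_read_tfrecord(sc, "file:///tmp/iris_tfrecord")

# An explicit table name with a separate path, spelling out the defaults.
spark_read_tfrecord(
  sc,
  name = "iris_tf",
  path = "file:///tmp/iris_tfrecord",
  record_type = "Example",
  overwrite = TRUE
)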
Examples
## Not run:
library(sparklyr)
library(sparktf)

sc <- spark_connect(master = "local")

iris_tbl <- copy_to(sc, iris)
data_path <- file.path(tempdir(), "iris")

# Index the Species column into a numeric "label" column.
df1 <- iris_tbl %>%
  ft_string_indexer_model(
    "Species", "label",
    labels = c("setosa", "versicolor", "virginica")
  )

# Write the indexed data as TFRecord files, then read them back.
df1 %>%
  spark_write_tfrecord(
    path = data_path,
    write_locality = "local"
  )

spark_read_tfrecord(sc, data_path)
## End(Not run)
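A further sketch of reading from a distributed filesystem; the HDFS path and table name are hypothetical, and it assumes the records were written in the "SequenceExample" format:

spark_read_tfrecord(
  sc,
  name = "clicks_tf",
  path = "hdfs:///user/analyst/clicks_tfrecord",  # hypothetical cluster path
  record_type = "SequenceExample"
)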
[Package sparktf version 0.1.0 Index]