h2o.exportFile {h2o} | R Documentation |
Export an H2O Data Frame (H2OFrame) to a File or to a collection of Files.
Description
Exports an H2OFrame (which can be either VA or FV) to a file. The file may be written to the H2O instance's local filesystem, to HDFS (preface the path with hdfs://), or to S3N (preface the path with s3n://).
Usage
h2o.exportFile(
data,
path,
force = FALSE,
sep = ",",
compression = NULL,
parts = 1,
header = TRUE,
quote_header = TRUE,
format = "csv",
write_checksum = TRUE
)
Arguments
data |
An H2OFrame object. |
path |
The path to write the file to. Must include the directory, and also the filename if exporting to a single file. May be prefaced with hdfs:// or s3n://. Each row of data appears as a line of the file. |
force |
logical, indicates whether to overwrite files that already exist. Default is FALSE, in which case the export fails if the target already exists. |
sep |
The field separator character. Values on each line of the file will be separated by this character (default ","). |
compression |
How to compress the exported dataset. Default is no compression; gzip, bzip2 and snappy are available. |
parts |
integer, number of part files to export to. Default is to write to a single file. Large data can be exported to multiple 'part' files, where each part file contains a subset of the data. The user can specify the maximum number of part files, or use the value -1 to let H2O determine the optimal number of files. When exporting to multiple part files, path is treated as a path to a directory. Part files follow the naming scheme 'part-m-?????'. |
header |
logical, indicates whether to write the header line. Default is to include the header in the output file. |
quote_header |
logical, indicates whether column names should be quoted. Default is to use quotes. |
format |
string, one of "csv" or "parquet". Default is "csv". Export to parquet is multipart and H2O itself determines the optimal number of files (1 file per chunk). |
write_checksum |
logical, if supported by the format (e.g. 'parquet'), export will include a checksum file for each exported data file. |
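The arguments above can be combined as in the following minimal sketch. It assumes H2O runs locally (started by h2o.init()), so that tempdir() on the R client is also visible to the H2O instance; the directory and file names are hypothetical and chosen only for illustration.

```r
library(h2o)
h2o.init()

iris_hf <- as.h2o(iris)

# Hypothetical output locations. tempdir() lives on the R client machine and
# only matches the H2O server's filesystem when H2O runs locally.
out_dir  <- file.path(tempdir(), "iris_parts")
out_file <- file.path(tempdir(), "iris.csv.gz")

# Multi-part export: parts = -1 lets H2O pick the number of part files,
# and path is treated as a directory.
h2o.exportFile(iris_hf, path = out_dir, parts = -1, force = TRUE)

# Single-file export: semicolon-separated, gzip-compressed, no header line.
h2o.exportFile(
  iris_hf,
  path = out_file,
  sep = ";",
  compression = "gzip",
  header = FALSE,
  force = TRUE
)
```

With a remote cluster, the same calls would need paths that exist on the H2O server, or an hdfs:// or s3n:// location.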
Details
If a file already exists at path, force = TRUE will overwrite it. With force = FALSE (the default), the operation will fail.
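The effect of force can be illustrated with a short sketch. It assumes a locally running H2O instance and a hypothetical target path under tempdir(); it also assumes that the failed export surfaces as an R error that tryCatch can intercept.

```r
library(h2o)
h2o.init()

iris_hf <- as.h2o(iris)

# Hypothetical target path, valid only when H2O runs on the same machine.
target <- file.path(tempdir(), "iris_force_demo.csv")

# First export: force = TRUE so the sketch can be re-run safely.
h2o.exportFile(iris_hf, path = target, force = TRUE)

# Second export with the default force = FALSE: the file now exists,
# so the call is expected to fail instead of overwriting it.
second_try <- tryCatch(
  {
    h2o.exportFile(iris_hf, path = target)
    "export succeeded"
  },
  error = function(e) paste("export failed:", conditionMessage(e))
)
print(second_try)
```

Catching the error keeps a longer script running when an existing file should be left alone rather than overwritten.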
Examples
## Not run:
library(h2o)
h2o.init()
iris_hf <- as.h2o(iris)
# The paths below are placeholders; uncomment a call and substitute a real path
# h2o.exportFile(iris_hf, path = "/path/on/h2o/server/filesystem/iris.csv")
# h2o.exportFile(iris_hf, path = "hdfs://path/in/hdfs/iris.csv")
# h2o.exportFile(iris_hf, path = "s3n://path/in/s3/iris.csv")
## End(Not run)