ParquetWriterProperties {arrow} | R Documentation
ParquetWriterProperties class
Description
This class holds settings to control how a Parquet file is written by ParquetFileWriter.
Details
The parameters compression, compression_level, use_dictionary and write_statistics support various patterns:
- The default NULL leaves the parameter unspecified, and the C++ library uses an appropriate default for each column (defaults are listed under Factory)
- A single, unnamed value (e.g. a single string for compression) applies to all columns
- An unnamed vector, of the same size as the number of columns, specifies a value for each column, in positional order
- A named vector specifies the value for the named columns; the default value for the setting is used for any column that is not named
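The patterns can be sketched with the high-level write_parquet, assuming it accepts use_dictionary with the same semantics as described here. The data frame, its column names, and the temporary path are illustrative only.

```r
library(arrow)

# Illustrative data: three columns, so per-column vectors have length 3
df <- data.frame(
  id = 1:3,
  label = c("a", "b", "a"),
  score = c(0.5, 1.5, 2.5)
)
path <- tempfile(fileext = ".parquet")

# NULL (the default): leave the setting to the library
write_parquet(df, path, use_dictionary = NULL)

# Single unnamed value: applies to all columns
write_parquet(df, path, use_dictionary = FALSE)

# Unnamed vector, one value per column, matched by position
write_parquet(df, path, use_dictionary = c(FALSE, TRUE, FALSE))

# Named vector: only `label` is set; other columns keep the default
write_parquet(df, path, use_dictionary = c(label = TRUE))

# Read the last file back to confirm it was written
roundtrip <- read_parquet(path)
```

An unnamed vector whose length is neither 1 nor the number of columns does not match any of these patterns, so it should be expected to fail rather than be recycled.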
Unlike the high-level write_parquet, ParquetWriterProperties arguments use the C++ defaults. Currently this means "uncompressed" rather than "snappy" for the compression argument.
Factory
The ParquetWriterProperties$create() factory method instantiates the object and takes the following arguments:
- table: table to write (required)
- version: Parquet version, "1.0" or "2.0". Default "1.0"
- compression: Compression algorithm. Default "uncompressed"
- compression_level: Compression level; meaning depends on the compression algorithm
- use_dictionary: Whether to use dictionary encoding. Default TRUE
- write_statistics: Whether to write statistics. Default TRUE
- data_page_size: Target threshold for the approximate encoded size of data pages within a column chunk (in bytes). Default 1 MiB.
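A minimal sketch of calling the factory, assuming the signature documented on this page. Some arrow releases name the first argument column_names rather than table, so the sketch inspects the signature before choosing what to pass. The data frame and the variable names first_arg and target are illustrative.

```r
library(arrow)

df <- data.frame(id = 1:3, label = c("a", "b", "a"))
tbl <- Table$create(df)

# Assumption: the first argument is `table`, as documented here. Some arrow
# releases take column names instead, so check the formals first.
first_arg <- names(formals(ParquetWriterProperties$create))[1]
target <- if (identical(first_arg, "table")) tbl else names(df)

props <- ParquetWriterProperties$create(
  target,
  version = "1.0",
  compression = "uncompressed",      # single value: all columns
  use_dictionary = c(label = TRUE),  # named vector: only `label`
  write_statistics = TRUE
)
```

The resulting object is meant to be handed to ParquetFileWriter; for most uses write_parquet builds the equivalent properties internally.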
See Also
Schema for information about schemas and metadata handling.