json_to_parquet {parquetize}R Documentation

Convert a json file to parquet format

Description

This function allows to convert a json or ndjson file to parquet format.

Two conversions possibilities are offered :

Usage

json_to_parquet(
  path_to_file,
  path_to_parquet,
  format = "json",
  partition = "no",
  compression = "snappy",
  compression_level = NULL,
  ...
)

Arguments

path_to_file

String that indicates the path to the input file (don't forget the extension).

path_to_parquet

String that indicates the path to the directory where the parquet files will be stored.

format

string that indicates if the format is "json" (by default) or "ndjson"

partition

String ("yes" or "no" - by default) that indicates whether you want to create a partitioned parquet file. If "yes", '"partitioning"' argument must be filled in. In this case, a folder will be created for each modality of the variable filled in '"partitioning"'. Be careful, this argument can not be "yes" if 'max_memory' or 'max_rows' argument are not NULL.

compression

compression algorithm. Default "snappy".

compression_level

compression level. Meaning depends on compression algorithm.

...

additional format-specific arguments, see arrow::write_parquet() and arrow::write_dataset() for more informations.

Value

A parquet file, invisibly

Examples


# Conversion from a local json file to a single parquet file ::

json_to_parquet(
  path_to_file = system.file("extdata","iris.json",package = "parquetize"),
  path_to_parquet = tempfile(fileext = ".parquet")
)

# Conversion from a local ndjson file to a partitioned parquet file  ::

json_to_parquet(
  path_to_file = system.file("extdata","iris.ndjson",package = "parquetize"),
  path_to_parquet = tempfile(fileext = ".parquet"),
  format = "ndjson"
)

[Package parquetize version 0.5.7 Index]