read_json_arrow {arrow}R Documentation

Read a JSON file

Description

Wrapper around JsonTableReader to read a newline-delimited JSON (ndjson) file into a data frame or Arrow Table.

Usage

read_json_arrow(
  file,
  col_select = NULL,
  as_data_frame = TRUE,
  schema = NULL,
  ...
)

Arguments

file

A character file name or URI, connection, literal data (either a single string or a raw vector), an Arrow input stream, or a FileSystem with path (SubTreeFileSystem).

If a file name, a memory-mapped Arrow InputStream will be opened and closed when finished; compression will be detected from the file extension and handled automatically. If an input stream is provided, it will be left open.

To be recognised as literal data, the input must be wrapped with I().

col_select

A character vector of column names to keep, as in the "select" argument to data.table::fread(), or a tidy selection specification of columns, as used in dplyr::select().

as_data_frame

Should the function return a tibble (default) or an Arrow Table?

schema

Schema that describes the table.

...

Additional options passed to JsonTableReader$create()

Details

If passed a path, will detect and handle compression from the file extension (e.g. .json.gz).

If schema is not provided, Arrow data types are inferred from the data:

When as_data_frame = TRUE, Arrow types are further converted to R types.

Value

A tibble, or a Table if as_data_frame = FALSE.

Examples


tf <- tempfile()
on.exit(unlink(tf))
writeLines('
    { "hello": 3.5, "world": false, "yo": "thing" }
    { "hello": 3.25, "world": null }
    { "hello": 0.0, "world": true, "yo": null }
  ', tf, useBytes = TRUE)

read_json_arrow(tf)

# Read directly from strings with `I()`
read_json_arrow(I(c('{"x": 1, "y": 2}', '{"x": 3, "y": 4}')))


[Package arrow version 16.1.0 Index]