RecordBatch class


A record batch is a collection of equal-length arrays matching a particular Schema. It is a table-like data structure that is semantically a sequence of fields, each a contiguous Arrow Array.


record_batch(..., schema = NULL)



...: A data.frame or a named set of Arrays or vectors. If given a mixture of data.frames and vectors, the inputs will be autospliced together (see examples). Alternatively, you can provide a single Arrow IPC InputStream, Message, Buffer, or R raw object containing a Buffer.


schema: A Schema, or NULL (the default) to infer the schema from the data supplied through the ... argument. When providing an Arrow IPC buffer, schema is required.
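A minimal sketch of the two arguments, assuming the arrow package is installed and attached. The column names and values are made up for illustration. The schema in the second call deliberately matches the R vector types, so it only shows how to pass schema explicitly and does not demonstrate type conversion.

```r
library(arrow)

# Autosplicing: a named vector and a data.frame are combined into one batch.
df <- data.frame(x = 1:3, y = c(1.5, 2.5, 3.5))
spliced <- record_batch(id = c("a", "b", "c"), df)
names(spliced)

# Explicit schema: the declared types match the R vectors
# (integer -> int32, double -> float64).
s <- schema(x = int32(), y = float64())
typed <- record_batch(x = 1:3, y = c(1.5, 2.5, 3.5), schema = s)
typed
```

Leaving schema as NULL in the second call should produce the same column types here, since they are what would be inferred from an integer and a double vector.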

S3 Methods and Usage

Record batches are data-frame-like, and many methods you expect to work on a data.frame are implemented for RecordBatch. This includes [, [[, $, names, dim, nrow, ncol, head, and tail. You can also pull the data from an Arrow record batch into R with as.data.frame(). See the examples.

A caveat about the $ method: because RecordBatch is an R6 object, $ is also used to access the object's methods (see below). Methods take precedence over the table's columns. So, batch$Slice would return the "Slice" method function even if there were a column in the table called "Slice".
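The precedence rule can be seen with a small, contrived batch. This is a sketch assuming the arrow package is attached; the column named "Slice" exists only to provoke the collision.

```r
library(arrow)

# A batch with a column whose name collides with the R6 method "Slice".
batch <- record_batch(Slice = 1:3, other = c("a", "b", "c"))

# `$` resolves to the R6 method, not the column.
is.function(batch$Slice)

# `[[` looks up a column by name, so it sidesteps the collision.
slice_col <- batch[["Slice"]]
as.vector(slice_col)
```

When column names are not known in advance, using [[ for column access avoids this ambiguity.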

R6 Methods

In addition to the more R-friendly S3 methods, a RecordBatch object has the following R6 methods that map onto the underlying C++ methods:

There are also some active bindings
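As a rough illustration of the difference between an R6 method and an active binding, the sketch below calls Slice (mentioned above) and reads two bindings. The names num_rows and num_columns, and the zero-based offset taken by Slice, are assumptions drawn from the arrow package rather than from this page, so check them against the installed version.

```r
library(arrow)

batch <- record_batch(x = 1:5, y = letters[1:5])

# R6 method: called with parentheses. Slice() is assumed to take a
# zero-based offset and an optional length.
middle <- batch$Slice(1, 2)
as.data.frame(middle)

# Active bindings (assumed names): read like fields, without parentheses.
batch$num_rows
batch$num_columns
```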


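# Autosplice a named vector and a data.frame into one record batch
batch <- record_batch(name = rownames(mtcars), mtcars)
batch[["cyl"]]  # extract a single column as an Arrow Array
as.data.frame(batch[4:8, c("gear", "hp", "wt")])  # subset rows and columns, then convert to an R data.frame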

