flply {fplyr} | R Documentation |
Read, process each block and return a list
Description
With flply() you can apply a function to each block of the file separately.
The result of each call is saved into a list, which is then returned.
flply() is similar to lapply(), except that it applies the function to each
block of the file rather than to each element of a list. It is also similar
to by(), except that it does not read the whole file into memory: each block
is processed as soon as it is read from disk.
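The block-by-block idea described above can be illustrated with a minimal base-R sketch. It is not fplyr's implementation: process_blocks() is a hypothetical helper, and it assumes that a block is a run of consecutive lines sharing the same value in the first field.

```r
# Hypothetical helper, NOT part of fplyr: a base-R illustration only.
# Assumption: a block is a run of consecutive lines with the same first field.
process_blocks <- function(path, FUN, sep = "\t") {
  con <- file(path, open = "r")
  on.exit(close(con))
  # The first line holds the column names.
  header <- strsplit(readLines(con, n = 1), sep, fixed = TRUE)[[1]]
  results <- list()
  buffer <- character(0)
  current_key <- NULL
  # Parse the buffered lines as one block, apply FUN, then empty the buffer.
  flush <- function() {
    block <- read.table(text = buffer, sep = sep, col.names = header,
                        stringsAsFactors = FALSE)
    results[[current_key]] <<- FUN(block)
    buffer <<- character(0)
  }
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0) break
    key <- strsplit(line, sep, fixed = TRUE)[[1]][1]
    # A new key means the previous block is complete: process it right away.
    if (!is.null(current_key) && key != current_key) flush()
    current_key <- key
    buffer <- c(buffer, line)
  }
  if (length(buffer) > 0) flush()
  results
}

# Small tab-separated file with two blocks, "a" and "b".
tmp <- tempfile(fileext = ".tsv")
writeLines(c("group\tx\ty",
             "a\t1\t2",
             "a\t2\t4",
             "b\t1\t3",
             "b\t3\t1"), tmp)
res <- process_blocks(tmp, function(d) sum(d$x * d$y))
```

Only the lines of the current block are held in memory at any time, which is the property that distinguishes this approach from calling by() on a fully loaded data frame.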
Usage
flply(
input,
FUN,
...,
key.sep = "\t",
sep = "\t",
skip = 0,
header = TRUE,
nblocks = Inf,
stringsAsFactors = FALSE,
colClasses = NULL,
select = NULL,
drop = NULL,
col.names = NULL,
parallel = 1
)
Arguments
input |
Path of the input file. |
FUN |
A function to be applied to each block. The first argument to the
function must be a data.table containing the current block. |
... |
Additional arguments to be passed to FUN. |
key.sep |
The character that delimits the first field from the rest. |
sep |
The field delimiter (often equal to key.sep). |
skip |
Number of lines to skip at the beginning of the file. |
header |
Whether the file has a header. |
nblocks |
The number of blocks to read. |
stringsAsFactors |
Whether to convert strings into factors. |
colClasses |
Vector or list specifying the class of each field. |
select |
The columns (names or numbers) to be read. |
drop |
The columns (names or numbers) not to be read. |
col.names |
Names of the columns. |
parallel |
Number of cores to use. |
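As a hedged illustration of how some of these arguments combine, the sketch below forwards an extra argument to FUN through ... and limits the run with nblocks. The helper rounded_cor() and its digits argument are hypothetical names introduced for this example; the call is guarded so that it does nothing when fplyr is not installed.

```r
# Guarded so the snippet is a no-op when fplyr is not installed.
res <- NULL
if (requireNamespace("fplyr", quietly = TRUE)) {
  f <- system.file("extdata", "dt_iris.csv", package = "fplyr")
  # Hypothetical per-block function taking an extra argument, `digits`.
  rounded_cor <- function(d, digits) {
    round(cor(d$Sepal.Length, d$Petal.Length), digits)
  }
  # `digits = 2` is passed on to rounded_cor() via `...`;
  # `nblocks = 2` asks flply() to read only two blocks.
  res <- fplyr::flply(f, rounded_cor, digits = 2, nblocks = 2)
}
```

Arguments that come after ... in the signature, such as nblocks, have to be given by name.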
Value
Returns a list containing, for each block, the result of applying FUN to it.
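Because the return value is an ordinary list, the usual list tools apply to it. The sketch below uses base R only: it builds a stand-in list with lapply() and split() on the built-in iris data (an assumption made so that the example runs without fplyr) and then collects the per-group coefficients into a matrix.

```r
# Stand-in for a list of per-block results, built in memory with base R.
# With fplyr, a list of models would come from flply() instead.
lm.list <- lapply(split(iris, iris$Species), function(d) {
  lm(Sepal.Length ~ ., data = d[, names(d) != "Species"])
})

# One row of coefficients per group, one column per model term.
coefs <- t(sapply(lm.list, coef))
```

The same sapply() call works on any list of fitted models, whatever produced the list.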
Slogan
flply: from file to list
Examples
f <- system.file("extdata", "dt_iris.csv", package = "fplyr")
# Compute, within each block, the correlation between Sepal.Length and Petal.Length
flply(f, function(d) cor(d$Sepal.Length, d$Petal.Length))
# Summarise each block
flply(f, summary)
# Make a different linear model for each block
block.lm <- function(d) {
  lm(Sepal.Length ~ ., data = d[, !"Species"])
}
lm.list <- flply(f, block.lm)