bsubset {bread} | R Documentation |
Pre-subsets rows of a data file by index number before loading it into memory
Description
A simple wrapper around data.table::fread() that subsets the rows of a file with the Unix 'sed' or 'awk' commands before the data is read. This is useful when a file is too large for the available memory and loading it fails with the 'cannot allocate vector of size' error. Supply the index numbers of the first and last rows to load with fread(), or use the head or tail arguments to load only the first or last rows of the file.
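The pre-filtering idea described above can be sketched as follows. This is a minimal illustration, not the package's actual implementation: the helper name build_sed_cmd() is hypothetical, and it assumes that the file has a one-line header and that row indices count data rows only (so data row i is file line i + 1).

```r
# Hypothetical helper, not part of bread: builds a sed command that prints
# the header line plus the data rows first_row..last_row.
# Assumption: the file has a one-line header, so data row i is file line i + 1.
build_sed_cmd <- function(file, first_row, last_row) {
  sprintf("sed -n '1p;%d,%dp' %s",
          as.integer(first_row) + 1L,
          as.integer(last_row) + 1L,
          shQuote(file, type = "sh"))  # quote the path for a Bourne-style shell
}

cmd <- build_sed_cmd("data.csv", first_row = 5, last_row = 10)
print(cmd)

# With data.table installed and sed available on the PATH, the command's
# output can be parsed directly, so only the selected lines reach R:
# dt <- data.table::fread(cmd = cmd)
```

Because the filtering happens in the shell, R only ever allocates memory for the lines that sed prints, which is what makes this approach workable for files larger than the available memory.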
Usage
bsubset(
file = NULL,
head = NULL,
tail = NULL,
first_row = NULL,
last_row = NULL,
...
)
Arguments
file |
String. Full path to a file |
head |
Numeric. Number of rows to load, counting from the first row of the file. |
tail |
Numeric. Number of rows to load, counting back from the last row of the file. |
first_row |
Numeric. First row of the portion of the file to subset. |
last_row |
Numeric. Last row of the portion of the file to subset. |
... |
Additional arguments to pass to data.table::fread(), such as 'sep'. |
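One way to picture how the four row arguments relate to each other is to reduce them to a single pair of row indices. The sketch below is illustrative only: resolve_rows() is a hypothetical helper, the n_rows parameter stands in for the number of data rows in the file, and the precedence of head over tail over first_row/last_row is an assumption rather than documented behaviour.

```r
# Hypothetical helper, not part of bread: turns the row arguments into a
# single c(first, last) pair of data-row indices.
# Assumption: head takes precedence over tail, which takes precedence over
# first_row / last_row.
resolve_rows <- function(n_rows, head = NULL, tail = NULL,
                         first_row = NULL, last_row = NULL) {
  if (!is.null(head)) return(c(1, min(head, n_rows)))
  if (!is.null(tail)) return(c(max(n_rows - tail + 1, 1), n_rows))
  # A missing first_row falls back to 1, a missing last_row to the last row.
  first <- if (is.null(first_row)) 1 else first_row
  last  <- if (is.null(last_row)) n_rows else min(last_row, n_rows)
  c(first, last)
}

resolve_rows(100, head = 5)                      # rows 1 to 5
resolve_rows(100, tail = 5)                      # rows 96 to 100
resolve_rows(100, first_row = 5, last_row = 10)  # rows 5 to 10
resolve_rows(100, first_row = 5)                 # rows 5 to 100
```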
Value
A data frame containing the selected rows.
Examples
file <- system.file('extdata', 'test.csv', package = 'bread')
## Use head or tail to load the first n or last n rows
bsubset(file = file, head = 5)
bsubset(file = file, tail = 5)
## Subset from the middle of a file
bsubset(file = file, first_row = 5, last_row = 10)
## first_row defaults to 1 and last_row defaults to the last row of the file
bsubset(file = file, first_row = 5)
bsubset(file = file, last_row = 10)