write_hdd {hdd} | R Documentation |
Saves or appends a data set into a HDD file
Description
This function saves in-memory or HDD data sets into HDD repositories. It is useful for appending several data sets to the same repository.
Usage
write_hdd(
x,
dir,
chunkMB = Inf,
rowsPerChunk,
compress = 50,
add = FALSE,
replace = FALSE,
showWarning,
...
)
Arguments
x |
A data set. |
dir |
The HDD repository, i.e. the directory where the HDD data is. |
chunkMB |
If the data has to be split in several files of |
rowsPerChunk |
Integer, default is missing. Alternative to the argument chunkMB. |
compress |
Compression rate to be applied by |
add |
Should the file be added to the existing repository? Default is FALSE. |
replace |
If |
showWarning |
If the data |
... |
Not currently used. |
Details
Creating a HDD data set with this function always creates an additional file named
“_hdd.txt” in the HDD folder. This file contains summary information on
the data: the number of rows, the number of variables, the first five lines and
a log of how the HDD data set has been created. To access the log directly from
R, use the function origin.
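As a minimal sketch of the above, assuming the hdd package is installed, the summary file can be inspected with base R and the creation log retrieved with origin. The variable names (hdd_path, info_file) are illustrative only.

```r
library(hdd)

# Create a small HDD data set in a temporary folder
hdd_path = tempfile()
write_hdd(iris, hdd_path)

# The summary file is written alongside the data files
info_file = file.path(hdd_path, "_hdd.txt")
file.exists(info_file)

# Plain-text summary: rows, variables, first lines, creation log
writeLines(readLines(info_file))

# The creation log, accessed from R through the HDD object
origin(hdd(hdd_path))
```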
Value
This function does not return anything in R. Instead it creates a folder
on disk containing .fst files. These files represent the data that has been
converted to the hdd format.
You can then read the created data with the function hdd().
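The following sketch, assuming the hdd package is installed, shows one way to check what was written to disk and then read it back. The names hdd_path and fst_files are illustrative, and the number of files produced depends on the chunk size.

```r
library(hdd)

# Write the iris data in small chunks to a temporary folder
hdd_path = tempfile()
write_hdd(iris, hdd_path, chunkMB = 0.001)

# The folder holds the .fst files that make up the HDD data set
fst_files = list.files(hdd_path, pattern = "\\.fst$")
fst_files

# Read the data back as a HDD object and summarize it
base_hdd = hdd(hdd_path)
summary(base_hdd)
```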
Author(s)
Laurent Berge
See Also
See hdd, sub-.hdd and cash-.hdd for the extraction and manipulation of out of
memory data. For importation of HDD data sets from text files: see txt2hdd.
See hdd_slice to apply functions to chunks of data (and create
HDD objects) and hdd_merge to merge large files.
To create/reshape HDD objects from memory or from other HDD objects, see
write_hdd.
To display general information from HDD objects: origin, summary.hdd,
print.hdd, dim.hdd and names.hdd.
Examples
# Toy example with iris data
# Let's create a HDD data set from iris data
hdd_path = tempfile() # => folder where the data will be saved
write_hdd(iris, hdd_path)
# Let's add data to it
for(i in 1:10) write_hdd(iris, hdd_path, add = TRUE)
base_hdd = hdd(hdd_path)
summary(base_hdd) # => 11 files, 1650 lines, 48.7KB on disk
# Let's save the iris data by chunks of 1KB
# we use replace = TRUE to delete the previous data
write_hdd(iris, hdd_path, chunkMB = 0.001, replace = TRUE)
base_hdd = hdd(hdd_path)
summary(base_hdd) # => 8 files, 150 lines, 10.2KB on disk