create_lazyarray {lazyarray} | R Documentation |
Create a lazy-array with given format and dimension
Description
Create a directory to store a lazy array. The path must not already exist. See load_lazyarray
for more details.
Usage
create_lazyarray(
path,
storage_format,
dim,
dimnames = NULL,
compress_level = 50L,
prefix = "",
multipart = TRUE,
multipart_mode = 1,
file_names = NULL,
meta_name = "lazyarray.meta"
)
Arguments
path |
path to a local drive to store array data |
storage_format |
data type, choices are |
dim |
integer vector, dimension of array, see |
dimnames |
list of vectors, names of each dimension, see |
compress_level |
0 to 100, level of compression. 0 means no compression, 100 means maximum compression. For persistent data, it's recommended to set 100. Default is 50. |
prefix |
character prefix of array partition |
multipart |
whether to split the array into multiple partitions, default is TRUE |
multipart_mode |
1 or 2, partition mode; see Details. |
file_names |
data file names without prefix/extensions; see details. |
meta_name |
header file name, default is "lazyarray.meta" |
Details
A lazy array stores array data on the hard drive and loads it on
demand. It differs from other packages such as "bigmemory"
in that the internal reading is multi-threaded, which gives a significant
speed boost on solid state drives.
One lazy array contains two parts: data file(s) and a meta file. The data files can be stored in two ways: non-partitioned and partitioned.
For a non-partitioned data array, the dimension is set when the array is created and cannot be changed afterwards.
For a partitioned data array, there are two partition modes,
defined by `multipart_mode`. In mode 1, each partition
has the same number of dimensions as the array, and its last dimension has length 1.
For example, data with dimension c(2,3,5) partitioned with mode 1 will have each
partition stored with dimension c(2,3,1). In mode 2, the last dimension is dropped
when storing each partition.
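The difference between the two modes can be sketched in base R. This is only an illustration of the partition shapes described above; it does not call lazyarray, and the variable names (x, mode1, mode2) are made up for the example.

```r
# Base R only: mimic the per-partition shapes, not the on-disk format.
x <- array(seq_len(2 * 3 * 5), dim = c(2, 3, 5))

# Number of partitions equals the length of the last dimension.
n_part <- dim(x)[length(dim(x))]

# Mode 1: each partition keeps the last dimension, with length 1.
mode1 <- lapply(seq_len(n_part), function(i) x[, , i, drop = FALSE])

# Mode 2: the last dimension is dropped from each partition.
mode2 <- lapply(seq_len(n_part), function(i) x[, , i])

dim(mode1[[1]])  # 2 3 1
dim(mode2[[1]])  # 2 3
```

Both lists hold the same values; only the dim attribute of each partition differs.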
file_names is used when irregular partition names should be used.
If multipart=FALSE, the whole array is stored in a single file under path.
The file name is <prefix><file_name>.fst. For example, by default prefix=""
and file_name="", so path/.fst stores the array data.
If multipart=TRUE, then file_names should be a character vector whose length
equals the array's last dimension. A 3x4x5 array has 5 partitions, each
partition name follows the <prefix><file_name>.fst convention, and one can
always use arr$get_partition_fpath() to find the location of the partition files.
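The naming convention can be reproduced with plain string operations. The sketch below uses base R only; the directory name "path" is a placeholder and the partition names "1" to "5" are hypothetical values chosen for illustration, not necessarily what the package uses when file_names is NULL.

```r
root <- "path"   # placeholder directory, as in the text above
prefix <- ""     # default prefix

# multipart = FALSE: a single file named <prefix><file_name>.fst
single_file <- file.path(root, paste0(prefix, "", ".fst"))
single_file
# "path/.fst"

# multipart = TRUE for a 3x4x5 array: one name per slice of the last dimension.
# These names are hypothetical; any character vector of length 5 fits the rule.
file_names <- as.character(seq_len(5))
partition_files <- file.path(root, paste0(prefix, file_names, ".fst"))
partition_files
# "path/1.fst" "path/2.fst" "path/3.fst" "path/4.fst" "path/5.fst"
```

For an existing array, arr$get_partition_fpath() remains the reliable way to obtain the actual file locations.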
For examples, see lazyarray.
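A minimal call might look like the sketch below. It uses only the arguments listed under Usage and is guarded so that it does nothing when the package is not installed. The value "double" for storage_format is an assumption, since the list of accepted data types is not reproduced on this page.

```r
# tempfile() returns a path that does not exist yet, as create_lazyarray requires.
array_path <- tempfile()
array_dim <- c(2L, 3L, 5L)

if (requireNamespace("lazyarray", quietly = TRUE)) {
  arr <- lazyarray::create_lazyarray(
    path = array_path,
    storage_format = "double",  # assumed to be an accepted data type
    dim = array_dim,
    multipart = TRUE,
    multipart_mode = 1
  )

  # Method named in Details; reports where the partition files are located.
  print(arr$get_partition_fpath())
}
```

With multipart_mode = 1 and dim c(2,3,5), each of the 5 partitions would be stored with dimension c(2,3,1), as described above.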
Value
A ClassLazyArray instance
Author(s)
Zhengjia Wang