bigdist {bigdist} | R Documentation |
Computes distances via dist
and saves them
as a file-backed matrix (FBM) using the bigstatsr package, or connects
to an existing FBM backup file on disk.
bigdist(mat, file, method = "euclidean", type = "float")
mat |
Numeric matrix. When missing, attempts to connect to existing backup file. See 'file' argument. |
file |
(string) Name of the backing file to be created or an existing backup file. Do not include trailing ".bk". See details for the backup file format. |
method |
(string or function) See method argument of
|
type |
(string, default: 'float') Storage type of FBM. See
|
The bigdist class is a list in which the key 'fbm' holds the FBM connection. The backing file name has the form <somename>_<size>_<type>.bk, where size is the number of observations and type is the data type, such as 'double' or 'float'.
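The naming pattern can be illustrated with a small base R sketch. The two helpers below are hypothetical and are not part of disto; they only assume the <somename>_<size>_<type>.bk form described above.

```r
# Hypothetical helper (not part of disto): build the backing-file name
# implied by the <somename>_<size>_<type>.bk pattern.
bigdist_backing_name <- function(prefix, n_obs, type = "float") {
  paste0(prefix, "_", n_obs, "_", type, ".bk")
}

# Hypothetical helper: split such a name back into its three parts.
parse_bigdist_backing_name <- function(path) {
  stem  <- sub("\\.bk$", "", basename(path))
  parts <- regmatches(stem, regexec("^(.*)_([0-9]+)_([A-Za-z ]+)$", stem))[[1]]
  if (length(parts) == 0L) {
    stop("name is not of the form <somename>_<size>_<type>.bk")
  }
  list(name = parts[2], size = as.integer(parts[3]), type = parts[4])
}

nm <- bigdist_backing_name("temp_ex1", 100)
nm
parse_bigdist_backing_name(nm)
```

When reconnecting with bigdist(file = ...), the same name is passed without the trailing ".bk", as noted for the 'file' argument.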
The bigstatsr package stores matrices on disk and allows efficient
computation on them. The disto package provides a unified frontend to read
parts of distance matrices and apply functions over rows/columns. For
efficient operations, write C++ functions that talk to bigstatsr's
FBM.
The distance computation and writing to the FBM may be parallelized by setting a future backend.
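A minimal sketch of setting a backend before calling bigdist follows. It assumes the 'future' and 'disto' packages are installed; the multisession strategy and the worker count of 2 are illustrative choices, not recommendations from this page.

```r
# Sketch only: runs when both packages are available.
if (requireNamespace("future", quietly = TRUE) &&
    requireNamespace("disto", quietly = TRUE)) {

  # Choose a parallel backend; plan() returns the previous plan,
  # which is kept so it can be restored afterwards.
  old_plan <- future::plan(future::multisession, workers = 2)

  set.seed(1)
  amat <- matrix(rnorm(1e3), ncol = 10)

  # tempfile() gives a fresh prefix so no existing backing file is reused.
  par_dist <- disto::bigdist(mat = amat, file = tempfile("temp_par"))

  # Restore the previous backend.
  future::plan(old_plan)
}
```

Whether parallel execution is faster depends on the matrix size, since starting worker sessions has a fixed cost.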
An object of class 'bigdist'.
# basics of 'bigdist'
# create a random matrix
set.seed(1)
amat <- matrix(rnorm(1e3), ncol = 10)
td <- tempdir()
# create a bigdist object with FBM (file-backed matrix) on disk
temp <- bigdist(mat = amat, file = file.path(td, "temp_ex1"))
temp
temp$fbm$backingfile
temp$fbm[1, 2]
# connect to FBM on disk as a bigdist object
temp2 <- bigdist(file = file.path(td, "temp_ex1_100_float"))
temp2
temp2$fbm[1,2]
# check the size of bigdist object
bigdist_size(temp)
# bigdist accessors
# ij
bigdist_extract(temp, 1, 2)
bigdist_extract(temp, 1:2, 3:4)
bigdist_extract(temp, 1:2, 3:4, product = "inner")
dim(bigdist_extract(temp, 1:2,))
dim(bigdist_extract(temp, , 3:4))
# k (lower triangle indexing)
bigdist_extract(temp, k = 3:7)
# bigdist replacers
# ij
bigdist_replace(temp, 1, 2, 10)
bigdist_extract(temp, 1, 2)
bigdist_replace(temp, 1:2, 3:4, 11:12)
bigdist_extract(temp, 1:2, 3:4, product = "inner")
# k (lower triangle indexing)
bigdist_replace(temp, k = 3:7, value = 51:55)
bigdist_extract(temp, k = 3:7)
# subset a bigdist object
temp_subset <- bigdist_subset(temp, index = 21:30, file = file.path(td, "temp_ex2"))
temp_subset
temp_subset$fbm$backingfile
# convert a dist object (in memory) to a bigdist object
temp3 <- as_bigdist(dist(mtcars), file = file.path(td, "temp_ex3"))
temp3
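The 'k' accessors in the examples index the lower triangle with a single position. The sketch below shows how such a position maps to an (i, j) pair, assuming the same column-wise lower-triangle order that base R 'dist' objects use. This page does not state the order, so treat it as an assumption; k_to_ij is a hypothetical helper and not part of disto.

```r
# Hypothetical helper (not part of disto): convert lower-triangle
# positions k into (i, j) pairs with i > j, for n observations,
# counting down each column in turn (the layout of base R 'dist').
k_to_ij <- function(k, n) {
  # Column j of the lower triangle holds n - j entries.
  col_ends <- cumsum((n - 1):1)
  j <- vapply(k, function(kk) which(col_ends >= kk)[1], integer(1))
  before <- (j - 1) * n - (j - 1) * j / 2  # entries in earlier columns
  i <- j + (k - before)
  cbind(i = i, j = j)
}

# Check against a base R dist object.
set.seed(1)
d  <- dist(matrix(rnorm(40), ncol = 4))   # 10 observations
ij <- k_to_ij(3:7, n = 10)
ij
all.equal(as.matrix(d)[ij], as.vector(d)[3:7])
```

If the assumption holds, bigdist_extract(temp, k = 3:7) would correspond to the (i, j) pairs returned by k_to_ij(3:7, n) for a bigdist object with n observations.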