Sketch {chickn} R Documentation

Sketch

Description

Computes the sketch of a data collection.

Usage

Sketch(Data, W, ind.col = 1:ncol(Data), ncores = 1, parallel = FALSE)


Arguments

 Data      A Filebacked Big Matrix of size n x N. Data signals are stored in the matrix columns.

 W         A frequency matrix of size m x n. The frequency vectors are stored in the matrix rows.

 ind.col   Column indices for which the data sketch is computed. By default, all matrix columns.

 ncores    Number of cores to use. By default 1. If parallel = FALSE, ncores defines the number of data splits on which the sketch is computed separately.

 parallel  Logical parameter that indicates whether the computations are performed on several cores in parallel.

Details

The sketch of the given data collection x_1, …, x_N is a vector of length 2m. The first m components of the sketch vector are its real part, i.e. \frac{1}{N} \sum_{i=1}^{N} \cos(W x_i). The last m components are its imaginary part, i.e. \frac{1}{N} \sum_{i=1}^{N} \sin(W x_i). Here \cos and \sin are applied elementwise to the vector W x_i of length m.
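
The formula above can be reproduced in base R on an ordinary in-memory matrix. The following is a minimal sketch, not the package implementation: sketch_reference is a hypothetical helper introduced here for illustration, and the frequency matrix is drawn with rnorm rather than with GenerateFrequencies. The second half illustrates why the sketch can be computed on data splits: it is an average over columns, so per-split sketches can be recombined by weighting each with its number of columns. Whether Sketch combines its splits in exactly this way is an assumption.

```r
# Reference computation of the sketch formula in base R (illustrative only).
# `sketch_reference` is a hypothetical helper, not part of chickn.
sketch_reference <- function(X, W) {
  # X: n x N numeric matrix, one signal per column
  # W: m x n numeric matrix, one frequency vector per row
  stopifnot(ncol(W) == nrow(X))
  P <- W %*% X                            # m x N matrix whose i-th column is W x_i
  c(rowMeans(cos(P)), rowMeans(sin(P)))   # real part first, then imaginary part
}

set.seed(1)
X <- matrix(rnorm(10 * 100), nrow = 10, ncol = 100)  # n = 10, N = 100
W <- matrix(rnorm(20 * 10), nrow = 20, ncol = 10)    # m = 20 (assumed random frequencies)

sk_full <- sketch_reference(X, W)
length(sk_full)                                      # 2 * m

# Same sketch computed on two column splits and recombined.
splits <- split(seq_len(ncol(X)), rep(1:2, length.out = ncol(X)))
parts <- sapply(splits, function(idx) {
  # Multiply by the split size so that the splits can be averaged jointly.
  sketch_reference(X[, idx, drop = FALSE], W) * length(idx)
})
sk_split <- rowSums(parts) / ncol(X)

all.equal(sk_full, sk_split)   # agreement up to floating-point tolerance
```

If the package follows the formula as stated, Sketch applied to an FBM holding the same X with the same W should give a comparable vector; this has not been checked here.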

Value

The data sketch vector, a numeric vector of length 2m.
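
Since the returned vector stacks the real part on top of the imaginary part, it can be convenient to repack it as a complex vector with one entry per frequency. A minimal illustration, using made-up values in place of an actual Sketch result:

```r
# Hypothetical sketch vector of length 2m with m = 3 (values are made up).
sk <- c(0.5, -0.2, 0.1, 0.3, 0.0, -0.4)
m <- length(sk) / 2

# The first m entries are the real part, the last m the imaginary part.
z <- complex(real = sk[seq_len(m)], imaginary = sk[m + seq_len(m)])

Mod(z)  # modulus of the sketch at each frequency
```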

References

Keriven N, Bourrier A, Gribonval R, Pérez P (2018). “Sketching for large-scale learning of mixture models.” Information and Inference: A Journal of the IMA, 7(3), 447–508.

Examples

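# Toy data: 100 signals of dimension 10, stored column-wise
X = matrix(rnorm(1000), ncol=100, nrow = 10)
# Copy the data into a Filebacked Big Matrix
X_FBM = bigstatsr::FBM(init = X, ncol=100, nrow = 10)
# Draw m = 20 frequency vectors (rows of W)
W = GenerateFrequencies(Data = X_FBM, m = 20, N0 = 100, TypeDist = "AR")$W
# Sketch computed sequentially
SK1 = Sketch(X_FBM, W)
# Same sketch computed in parallel on 2 cores
SK2 = Sketch(X_FBM, W, parallel = TRUE, ncores = 2)
# The two results should agree
all.equal(SK1, SK2)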


[Package chickn version 1.2.3 Index]