Sketch {chickn} | R Documentation |
The data sketch computation.
Sketch(Data, W, ind.col = 1:ncol(Data), ncores = 1, parallel = FALSE)
Data | A Filebacked Big Matrix n x N. Data signals are stored in the matrix columns.
W | A frequency matrix m x n. The frequency vectors are stored in the matrix rows.
ind.col | Column indices for which the data sketch is computed. By default, all matrix columns.
ncores | Number of cores to use. By default 1. If
parallel | Logical. Indicates whether computations are performed on several cores in parallel.
The sketch of a data collection x_1, \dots, x_N is a vector of length 2m. Its first m components are the real part of the sketch, \frac{1}{N} \sum_{i=1}^{N} \cos(W x_i), and its last m components are the imaginary part, \frac{1}{N} \sum_{i=1}^{N} \sin(W x_i), where \cos and \sin are applied elementwise to the m-vector W x_i.
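The formula above can be checked on a small in-memory matrix with base R alone. The following is a minimal sketch, not the package's implementation: `sketch_reference` is a hypothetical helper introduced here for illustration, it works on an ordinary matrix rather than a Filebacked Big Matrix, and averaging over only the selected columns when `ind.col` is a subset is an assumption.

```r
# Illustrative reference computation of the data sketch in base R.
# sketch_reference is a hypothetical helper, not a chickn function.
sketch_reference <- function(X, W, ind.col = seq_len(ncol(X))) {
  # X: n x N matrix with one signal per column
  # W: m x n matrix with one frequency vector per row
  P <- W %*% X[, ind.col, drop = FALSE]  # column i holds the m-vector W x_i
  # Average over the selected signals (assumption when ind.col is a subset):
  # real part first, imaginary part second.
  c(rowMeans(cos(P)), rowMeans(sin(P)))
}

set.seed(1)
X <- matrix(rnorm(1000), nrow = 10, ncol = 100)  # n = 10, N = 100
W <- matrix(rnorm(200), nrow = 20, ncol = 10)    # m = 20 frequency vectors
sk <- sketch_reference(X, W)
length(sk)  # 2 * m = 40
```

Every component is an average of cosines or sines, so all entries lie in [-1, 1], which gives a quick sanity check on any computed sketch.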
The data sketch vector of length 2m.
Keriven N, Bourrier A, Gribonval R, Pérez P (2018). “Sketching for large-scale learning of mixture models.” Information and Inference: A Journal of the IMA, 7(3), 447–508.
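# 10 x 100 matrix: 100 signals of dimension 10, stored in columns
X = matrix(rnorm(1000), ncol = 100, nrow = 10)
X_FBM = bigstatsr::FBM(init = X, ncol = 100, nrow = 10)
# Frequency matrix with m = 20 rows
W = GenerateFrequencies(Data = X_FBM, m = 20, N0 = 100, TypeDist = "AR")$W
# Sequential and parallel computations should give the same sketch
SK1 = Sketch(X_FBM, W)
SK2 = Sketch(X_FBM, W, parallel = TRUE, ncores = 2)
all.equal(SK1, SK2)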