Nystrom_kernel {chickn} | R Documentation |
An implementation of the Nystrom kernel approximation method.
Nystrom_kernel(
  Data, c, l, s,
  gamma = NULL,
  max_neighbors = 32,
  DIR_output = tempfile(),
  DIR_save = tempfile(),
  ncores = 2,
  ncores_svd = 1,
  distance_type = "W1",
  kernel_type = "Gaussian",
  verbose = FALSE
)
Data |
A Filebacked Big Matrix n x N. Data vectors are stored in the matrix columns. |
c |
Number of columns selected for the approximation. |
l |
An intermediate rank l < c. |
s |
A target rank s < l. |
gamma |
Kernel parameter. If it is NULL (default), the parameter is estimated using |
max_neighbors |
Number of neighbors selected for the parameter estimation. |
DIR_output |
A directory for intermediate computations. |
DIR_save |
A directory to save the result. |
ncores |
Number of cores. Default is 2. |
ncores_svd |
Number of cores used for the SVD computation. It is recommended to use 1 core (default). |
distance_type |
Distance function type. The available types are Wasserstein-1 ('W1') and Euclidean ('Euclide'). The default value is 'W1'. |
kernel_type |
Kernel function type, either 'Gaussian' or 'Laplacian'. The default value is 'Gaussian'. |
verbose |
Logical indicating whether to display the processing steps. |
The Nystrom method approximates the kernel matrix K by C W^{-1} C^{\top}, where
C \in R^{N \times c} is obtained from K by randomly selecting c columns, and
W \in R^{c \times c} is obtained from C by selecting the corresponding c rows.
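The construction of C and W can be illustrated on a small dense matrix. The sketch below is not the chickn implementation: it uses base R only, assumes a Gaussian kernel with Euclidean distance and an arbitrary gamma = 0.5, and replaces W^{-1} with a plain SVD-based pseudo-inverse instead of the rank-l and rank-s steps controlled by the `l` and `s` arguments. All variable names are illustrative.

```r
# Minimal Nystrom sketch in base R (illustrative, not the chickn code path).
set.seed(1)
N <- 50
X <- matrix(rnorm(2 * N), nrow = 2)       # data vectors stored in columns
D2 <- as.matrix(dist(t(X)))^2             # squared Euclidean distances, N x N
K <- exp(-0.5 * D2)                       # full Gaussian kernel, gamma = 0.5 (assumed)

n_cols <- 20                              # plays the role of the argument c
idx <- sample(N, n_cols)                  # random sample of column indices
C <- K[, idx, drop = FALSE]               # N x c: selected columns of K
W <- C[idx, , drop = FALSE]               # c x c: corresponding rows of C

# Pseudo-inverse of W through its SVD, dropping negligible singular values.
sv <- svd(W)
keep <- sv$d > max(sv$d) * 1e-10
W_pinv <- sv$v[, keep, drop = FALSE] %*%
  (t(sv$u[, keep, drop = FALSE]) / sv$d[keep])

K_approx <- C %*% W_pinv %*% t(C)         # N x N Nystrom approximation
rel_err <- norm(K - K_approx, "F") / norm(K, "F")
print(rel_err)
```

The full matrix K is built here only to measure the approximation error; in practice the point of the method is that only the N x c block C has to be computed.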
The kernel function, based on the distance metric, is defined as k(x_i,x_j) = e^{-\gamma \cdot d^p(x_i,x_j)},
where p = 1 for the 'Laplacian' kernel and p = 2 for the 'Gaussian' kernel, and
d(x_i,x_j) is the distance between data vectors x_i and x_j.
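As a worked instance of this formula, the following sketch evaluates both kernel types for a pair of vectors. The helper `kernel_value` is hypothetical and not part of chickn, and it assumes the Euclidean distance for d; the package's default 'W1' distance is not reproduced here.

```r
# Hypothetical helper (not a chickn function): k(x, y) = exp(-gamma * d(x, y)^p)
kernel_value <- function(x, y, gamma,
                         kernel_type = c("Gaussian", "Laplacian")) {
  kernel_type <- match.arg(kernel_type)
  p <- if (kernel_type == "Gaussian") 2 else 1   # exponent on the distance
  d <- sqrt(sum((x - y)^2))                      # Euclidean distance (assumed)
  exp(-gamma * d^p)
}

x <- c(0, 0)
y <- c(3, 4)                                     # d(x, y) = 5
k_gauss <- kernel_value(x, y, gamma = 0.1, kernel_type = "Gaussian")   # exp(-0.1 * 25)
k_lapl  <- kernel_value(x, y, gamma = 0.1, kernel_type = "Laplacian")  # exp(-0.1 * 5)
print(c(k_gauss, k_lapl))
```

For the same gamma, the Gaussian kernel decays faster than the Laplacian one whenever d > 1, which is one reason gamma is usually chosen relative to typical distances in the data.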
A list with the following attributes:
K_W1
is the Filebacked Big Matrix of the Nystrom kernel approximation.
gamma
is the estimated kernel parameter.
RandomSample
contains the indices of the data vectors selected for the Nystrom approximation.
This is an implementation of the Nystrom kernel approximation method proposed in Wang S, Gittens A, Mahoney MW (2019). “Scalable kernel K-means clustering with Nyström approximation: relative-error bounds.” The Journal of Machine Learning Research, 20(1), 431–479.
W1_parallel, gamma_estimation, big_randomSVD, cumsum_parallel.
X = matrix(rnorm(2000), ncol = 100, nrow = 20)
X_FBM = bigstatsr::FBM(init = X, ncol = 100, nrow = 20)
output = Nystrom_kernel(Data = X_FBM, c = 10, l = 7, s = 5, max_neighbors = 3, ncores = 2)