| GloVe {rsparse} | R Documentation |
Global Vectors
Description
Creates a Global Vectors (GloVe) matrix factorization model.
Public fields
components: represents context embeddings
bias_i: bias term i as per paper
bias_j: bias term j as per paper
shuffle: logical = FALSE by default. Whether to perform shuffling before each SGD iteration. Generally shuffling is a good practice for SGD.
Methods
Public methods
Method new()
Creates GloVe model object
Usage
GloVe$new(
rank,
x_max,
learning_rate = 0.15,
alpha = 0.75,
lambda = 0,
shuffle = FALSE,
init = list(w_i = NULL, b_i = NULL, w_j = NULL, b_j = NULL)
)
Arguments
rank: desired dimension for the latent vectors
x_max: integer. Maximum number of co-occurrences to use in the weighting function.
learning_rate: numeric. Learning rate for SGD. I do not recommend that you modify this parameter, since AdaGrad will quickly adjust it to an optimal value.
alpha: numeric = 0.75. The alpha in the weighting function formula: f(x) = 1 if x > x_max; else (x/x_max)^alpha
lambda: numeric = 0.0. Regularization parameter.
shuffle: see the shuffle field
init: list(w_i = NULL, b_i = NULL, w_j = NULL, b_j = NULL). Initialization for embeddings (w_i, w_j) and biases (b_i, b_j). w_i, w_j are numeric matrices and should have #rows = rank, #columns = expected number of rows (w_i) / columns (w_j) in the input matrix. b_i, b_j are numeric vectors and should have length equal to the expected number of rows (b_i) / columns (b_j) in the input matrix.
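The weighting function and the shapes expected by init can be sketched in base R. This is an illustration only, not the package's internal code: glove_weight and make_glove_init are hypothetical helper names that do not exist in rsparse, and the uniform starting values are an arbitrary assumption.

```r
# Weighting function as stated in the alpha argument:
# f(x) = 1 if x > x_max, else (x / x_max)^alpha.
glove_weight <- function(x, x_max, alpha = 0.75) {
  ifelse(x > x_max, 1, (x / x_max)^alpha)
}

# Hypothetical helper (not part of rsparse): builds an `init` list with the
# documented shapes for an input matrix with n_rows rows and n_cols columns.
# The small uniform starting values are an assumption made for illustration.
make_glove_init <- function(rank, n_rows, n_cols) {
  list(
    w_i = matrix(runif(rank * n_rows, -0.5, 0.5) / rank, nrow = rank, ncol = n_rows),
    b_i = numeric(n_rows),  # one bias per row of the input matrix
    w_j = matrix(runif(rank * n_cols, -0.5, 0.5) / rank, nrow = rank, ncol = n_cols),
    b_j = numeric(n_cols)   # one bias per column of the input matrix
  )
}

# Counts at or above x_max get full weight; rarer pairs are down-weighted.
w <- glove_weight(c(1, 5, 10, 50), x_max = 10)

# Shapes for a 6 x 6 co-occurrence matrix and rank 4.
init <- make_glove_init(rank = 4, n_rows = 6, n_cols = 6)
```

A list built this way could be supplied as the init argument of GloVe$new(), provided n_rows and n_cols match the matrix later passed to fit_transform().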
Method fit_transform()
fits model and returns embeddings
Usage
GloVe$fit_transform(
x,
n_iter = 10L,
convergence_tol = -1,
n_threads = getOption("rsparse_omp_threads", 1L),
...
)
Arguments
x: An input term co-occurrence matrix, preferably in dgTMatrix format.
n_iter: integer. Number of SGD iterations.
convergence_tol: numeric = -1. Defines the early stopping strategy. Fitting stops when either of the following conditions is satisfied: (a) all iterations have been run, or (b) cost_previous_iter / cost_current_iter - 1 < convergence_tol.
n_threads: number of threads to use
...: not used at the moment
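The early stopping rule for convergence_tol can be illustrated with a small base R sketch. It applies condition (b) exactly as stated to a vector of made-up cost values; iterations_run is a hypothetical helper, and the package's internal check may differ in detail.

```r
# Applies the documented rule to a vector of per-iteration costs and returns
# the number of iterations that would be run before stopping.
iterations_run <- function(costs, convergence_tol = -1) {
  for (i in seq_along(costs)[-1]) {
    # relative improvement of iteration i over iteration i - 1
    improvement <- costs[i - 1] / costs[i] - 1
    if (improvement < convergence_tol) return(i)
  }
  length(costs)  # condition (a): all iterations were run
}

costs <- c(1.00, 0.50, 0.40, 0.399, 0.398)  # made-up cost values

n_default <- iterations_run(costs)                          # default tolerance
n_strict  <- iterations_run(costs, convergence_tol = 0.01)  # stop below 1% gain
```

With positive costs the ratio cost_previous_iter / cost_current_iter is always positive, so the left-hand side is always greater than -1. Under the stated rule, the default convergence_tol = -1 therefore never triggers early stopping, and all n_iter iterations run.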
Method get_history()
returns value of the loss function for each epoch
Usage
GloVe$get_history()
Method clone()
The objects of this class are cloneable with this method.
Usage
GloVe$clone(deep = FALSE)
Arguments
deep: Whether to make a deep clone.
References
http://nlp.stanford.edu/projects/glove/
Examples
library(rsparse)
data('movielens100k')
co_occurrence = crossprod(movielens100k)
glove_model = GloVe$new(rank = 4, x_max = 10, learning_rate = .25)
embeddings = glove_model$fit_transform(co_occurrence, n_iter = 2, n_threads = 1)
embeddings = embeddings + t(glove_model$components) # embeddings + context embeddings
# rank = 4, so the result has one row per item and 4 columns
identical(dim(embeddings), c(ncol(movielens100k), 4L))