get_local_vocab {conText} | R Documentation
Identify words common to a collection of texts and a set of pretrained embeddings.
Description
The local vocabulary is the intersection of two sets: the words that appear in the collection of texts and the features that have a pretrained embedding.
Usage
get_local_vocab(context, pre_trained)
Arguments
context |
(character) vector of contexts (usually |
pre_trained |
(numeric) an F x D matrix of pretrained embeddings, where F = number of features and D = number of embedding dimensions. rownames(pre_trained) must give the set of features for which there is a pretrained embedding. |
Value
(character) vector of words common to the texts and pretrained embeddings.
Examples
library(conText)

# find the local vocab (use it to define the candidate set of nearest neighbors)
local_vocab <- get_local_vocab(cr_sample_corpus, pre_trained = cr_glove_subset)
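
The intersection described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the toy objects `contexts`, `toy_embeddings`, and `toy_local_vocab` are hypothetical, and plain whitespace splitting is an assumption that may differ from how conText tokenizes text.

```r
# Hypothetical toy inputs: two short contexts and a 4 x 2 embedding matrix
# whose row names are the features that have an embedding.
contexts <- c("the senate passed the bill", "the house debated the budget")
toy_embeddings <- matrix(
  seq_len(8) / 10, nrow = 4, ncol = 2,
  dimnames = list(c("senate", "bill", "budget", "tax"), NULL)
)

# Assumption: split on whitespace and keep the unique tokens.
tokens <- unique(unlist(strsplit(contexts, "\\s+")))

# Local vocab: tokens that also appear among the embedding row names.
toy_local_vocab <- intersect(tokens, rownames(toy_embeddings))
print(toy_local_vocab)
```

Here "tax" has an embedding but never occurs in the texts, and "the" occurs in the texts but has no embedding, so neither ends up in the result.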
[Package conText version 1.4.3 Index]