get_local_vocab {conText}    R Documentation

Identify words common to a collection of texts and a set of pretrained embeddings.

Description

The local vocabulary is the intersection of two sets: the features that have a pretrained embedding and the words that appear in the collection of texts.

Usage

get_local_vocab(context, pre_trained)

Arguments

context

(character) vector of contexts (usually the context column of the get_context() output)

pre_trained

(numeric) an F x D matrix of pretrained embeddings, where F is the number of features and D is the number of embedding dimensions. rownames(pre_trained) gives the set of features for which there is a pretrained embedding.
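
As a rough illustration of the shape described for pre_trained, the sketch below builds a small numeric matrix with features as row names. The object name toy_pre_trained and its values are hypothetical and are not part of conText.

```r
# Hypothetical toy embeddings: F = 3 features, D = 4 dimensions.
# The values are arbitrary placeholders, not real embeddings.
toy_pre_trained <- matrix(
  seq_len(12) / 10,
  nrow = 3, ncol = 4,
  dimnames = list(c("immigration", "border", "policy"), NULL)
)

dim(toy_pre_trained)        # F rows by D columns
rownames(toy_pre_trained)   # features that have an embedding
is.numeric(toy_pre_trained) # the matrix must be numeric
```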

Value

(character) vector of words common to the texts and pretrained embeddings.

Examples

# find the local vocab (use it to define the candidate nearest neighbors)
local_vocab <- get_local_vocab(cr_sample_corpus, pre_trained = cr_glove_subset)
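
The intersection described above can be approximated in base R. This is only a minimal sketch under stated assumptions: it lowercases the texts and splits them on whitespace, which may differ from how get_local_vocab() tokenizes internally, and the objects toy_contexts, toy_embeddings and local_vocab_sketch are hypothetical.

```r
# Hypothetical toy inputs (not shipped with conText)
toy_contexts <- c("the immigration bill passed", "a new border policy")
toy_embeddings <- matrix(
  0,
  nrow = 4, ncol = 3,
  dimnames = list(c("the", "bill", "policy", "senate"), NULL)
)

# Assumption: simple lowercase + whitespace tokenization
toy_tokens <- unique(unlist(strsplit(tolower(toy_contexts), "\\s+")))

# Words present both in the texts and among the embedding row names
local_vocab_sketch <- intersect(toy_tokens, rownames(toy_embeddings))
local_vocab_sketch
```

Here "senate" has an embedding but never occurs in the texts, and "immigration" occurs in the texts but has no embedding, so neither ends up in the result.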

[Package conText version 1.4.3 Index]