text2vec-package | text2vec |
as.lda_c | Converts a sparse document-term matrix to 'lda_c' format |
BNS | Bi-Normal Separation (BNS) |
char_tokenizer | Simple tokenization functions for string splitting |
check_analogy_accuracy | Checks accuracy of word embeddings on the analogy task |
coherence | Coherence metrics for topic models |
Collocations | Collocations model |
combine_vocabularies | Combines multiple vocabularies into one |
create_dtm | Document-term matrix construction |
create_dtm.itoken | Document-term matrix construction |
create_dtm.itoken_parallel | Document-term matrix construction |
create_tcm | Term co-occurrence matrix construction |
create_tcm.itoken | Term co-occurrence matrix construction |
create_tcm.itoken_parallel | Term co-occurrence matrix construction |
create_vocabulary | Creates a vocabulary of unique terms |
create_vocabulary.character | Creates a vocabulary of unique terms |
create_vocabulary.itoken | Creates a vocabulary of unique terms |
create_vocabulary.itoken_parallel | Creates a vocabulary of unique terms |
dist2 | Pairwise Distance Matrix Computation |
distances | Pairwise Distance Matrix Computation |
GlobalVectors | Re-export of rsparse::GloVe |
GloVe | Re-export of rsparse::GloVe |
hash_vectorizer | Vocabulary and hash vectorizers |
idir | Creates iterator over text files from the disk |
ifiles | Creates iterator over text files from the disk |
ifiles_parallel | Creates iterator over text files from the disk |
itoken | Iterators (and parallel iterators) over input objects |
itoken.character | Iterators (and parallel iterators) over input objects |
itoken.iterator | Iterators (and parallel iterators) over input objects |
itoken.list | Iterators (and parallel iterators) over input objects |
itoken_parallel | Iterators (and parallel iterators) over input objects |
itoken_parallel.character | Iterators (and parallel iterators) over input objects |
itoken_parallel.iterator | Iterators (and parallel iterators) over input objects |
itoken_parallel.list | Iterators (and parallel iterators) over input objects |
jsPCA_robust | Numerically robust dimension reduction via Jensen-Shannon divergence and principal components |
LatentDirichletAllocation | Creates a Latent Dirichlet Allocation model |
LatentSemanticAnalysis | Latent Semantic Analysis model |
LDA | Creates a Latent Dirichlet Allocation model |
LSA | Latent Semantic Analysis model |
movie_review | IMDB movie reviews |
normalize | Matrix normalization |
pdist2 | Pairwise Distance Matrix Computation |
perplexity | Perplexity of a topic model |
postag_lemma_tokenizer | Simple tokenization functions for string splitting |
prepare_analogy_questions | Prepares list of analogy questions |
print.text2vec_vocabulary | Printing Vocabulary |
prune_vocabulary | Prune vocabulary |
psim2 | Pairwise Similarity Matrix Computation |
RelaxedWordMoversDistance | Creates a Relaxed Word Mover's Distance (RWMD) model |
RWMD | Creates a Relaxed Word Mover's Distance (RWMD) model |
sim2 | Pairwise Similarity Matrix Computation |
similarities | Pairwise Similarity Matrix Computation |
space_tokenizer | Simple tokenization functions for string splitting |
split_into | Split a vector for parallel processing |
text2vec | text2vec |
TfIdf | TF-IDF (term frequency-inverse document frequency) model |
tokenizers | Simple tokenization functions for string splitting |
vectorizers | Vocabulary and hash vectorizers |
vocabulary | Creates a vocabulary of unique terms |
vocab_vectorizer | Vocabulary and hash vectorizers |
word_tokenizer | Simple tokenization functions for string splitting |