text2vec-package |
text2vec |
as.lda_c |
Converts a sparse document-term matrix to 'lda_c' format |
char_tokenizer |
Simple tokenization functions for string splitting |
check_analogy_accuracy |
Checks accuracy of word embeddings on the analogy task |
coherence |
Coherence metrics for topic models |
Collocations |
Collocations model. |
combine_vocabularies |
Combines multiple vocabularies into one |
create_dtm |
Document-term matrix construction |
create_dtm.itoken |
Document-term matrix construction |
create_dtm.itoken_parallel |
Document-term matrix construction |
create_tcm |
Term-co-occurrence matrix construction |
create_tcm.itoken |
Term-co-occurrence matrix construction |
create_tcm.itoken_parallel |
Term-co-occurrence matrix construction |
create_vocabulary |
Creates a vocabulary of unique terms |
create_vocabulary.character |
Creates a vocabulary of unique terms |
create_vocabulary.itoken |
Creates a vocabulary of unique terms |
create_vocabulary.itoken_parallel |
Creates a vocabulary of unique terms |
dist2 |
Pairwise Distance Matrix Computation |
distances |
Pairwise Distance Matrix Computation |
GlobalVectors |
Re-export of rsparse::GloVe |
GloVe |
Re-export of rsparse::GloVe |
hash_vectorizer |
Vocabulary and hash vectorizers |
idir |
Creates iterator over text files from the disk |
ifiles |
Creates iterator over text files from the disk |
ifiles_parallel |
Creates iterator over text files from the disk |
itoken |
Iterators (and parallel iterators) over input objects |
itoken.character |
Iterators (and parallel iterators) over input objects |
itoken.iterator |
Iterators (and parallel iterators) over input objects |
itoken.list |
Iterators (and parallel iterators) over input objects |
itoken_parallel |
Iterators (and parallel iterators) over input objects |
itoken_parallel.character |
Iterators (and parallel iterators) over input objects |
itoken_parallel.iterator |
Iterators (and parallel iterators) over input objects |
itoken_parallel.list |
Iterators (and parallel iterators) over input objects |
jsPCA_robust |
(numerically robust) Dimension reduction via Jensen-Shannon Divergence & Principal Components |
LatentDirichletAllocation |
Creates Latent Dirichlet Allocation model. |
LatentSemanticAnalysis |
Latent Semantic Analysis model |
LDA |
Creates Latent Dirichlet Allocation model. |
LSA |
Latent Semantic Analysis model |
movie_review |
IMDB movie reviews |
normalize |
Matrix normalization |
pdist2 |
Pairwise Distance Matrix Computation |
perplexity |
Perplexity of a topic model |
postag_lemma_tokenizer |
Simple tokenization functions for string splitting |
prepare_analogy_questions |
Prepares list of analogy questions |
print.text2vec_vocabulary |
Printing Vocabulary |
prune_vocabulary |
Prune vocabulary |
psim2 |
Pairwise Similarity Matrix Computation |
RelaxedWordMoversDistance |
Creates Relaxed Word Mover's Distance (RWMD) model |
RWMD |
Creates Relaxed Word Mover's Distance (RWMD) model |
sim2 |
Pairwise Similarity Matrix Computation |
similarities |
Pairwise Similarity Matrix Computation |
space_tokenizer |
Simple tokenization functions for string splitting |
split_into |
Split a vector for parallel processing |
text2vec |
text2vec |
TfIdf |
TfIdf |
tokenizers |
Simple tokenization functions for string splitting |
vectorizers |
Vocabulary and hash vectorizers |
vocabulary |
Creates a vocabulary of unique terms |
vocab_vectorizer |
Vocabulary and hash vectorizers |
word_tokenizer |
Simple tokenization functions for string splitting |