as.matrix.paragraph2vec {doc2vec} | R Documentation |
Get the document or word vectors of a paragraph2vec model
Description
Get the document or word vectors of a paragraph2vec model as a dense matrix.
Usage
## S3 method for class 'paragraph2vec'
as.matrix(
x,
which = c("docs", "words"),
normalize = TRUE,
encoding = "UTF-8",
...
)
Arguments
x |
a paragraph2vec model as returned by paragraph2vec or read.paragraph2vec |
which |
one of 'docs' or 'words', selecting whether document or word vectors are returned |
normalize |
logical indicating whether to normalize the embeddings. Defaults to TRUE. |
encoding |
set the encoding of the row names to the specified encoding. Defaults to 'UTF-8'. |
... |
not used |
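As a rough illustration of what the normalize argument changes, the base R sketch below scales each row of a toy matrix to unit Euclidean length. This page does not say which normalization doc2vec applies, so the L2 row scaling is an assumption, and l2_normalize is a hypothetical helper, not part of the package.

```r
# Toy 2 x 3 matrix standing in for unnormalized embeddings; values are invented
raw <- matrix(c(3, 4, 0,
                0, 5, 12),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("doc_1", "doc_2"), NULL))

# Hypothetical helper (assumption): divide every row by its Euclidean (L2) norm
l2_normalize <- function(m) m / sqrt(rowSums(m^2))

unit <- l2_normalize(raw)

# After scaling, every row has Euclidean length 1
row_norms <- sqrt(rowSums(unit^2))
```

If that assumption holds, normalize = FALSE would be the choice when the original vector magnitudes matter for downstream use.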
Value
a matrix with the document or word vectors where the rownames are the documents or words upon which the model was trained
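Because the row names carry the documents or words, the returned matrix can be indexed by name and compared with ordinary matrix algebra. The sketch below uses an invented toy matrix in place of a real as.matrix(model, which = "words") result; the terms and values are made up, and the rows are assumed to already have unit length so that a dot product equals cosine similarity.

```r
# Toy stand-in for a word embedding matrix; terms and values are invented,
# and each row already has unit Euclidean length
emb <- matrix(c( 0.6, 0.8,
                 0.8, 0.6,
                -1.0, 0.0),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("loi", "vote", "budget"), NULL))

# Row names identify the term, so a single vector can be pulled out by name
query <- emb["loi", ]

# Dot product of every row with the query vector;
# for unit-length rows this is the cosine similarity
scores <- drop(emb %*% query)

# Rank the remaining rows from most to least similar to the query
ranked <- sort(scores[names(scores) != "loi"], decreasing = TRUE)
```

The same pattern applies to document vectors obtained with which = "docs".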
See Also
paragraph2vec, read.paragraph2vec
Examples
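library(doc2vec)
library(tokenizers.bpe)
## Keep the French-language texts from the example data in tokenizers.bpe
data(belgium_parliament, package = "tokenizers.bpe")
x <- subset(belgium_parliament, language %in% "french")
## Drop empty texts and texts of 1000 words or more
x <- subset(x, nchar(text) > 0 & txt_count_words(text) < 1000)
## Train a model; the second call overwrites the first, so the matrices below come from PV-DBOW
model <- paragraph2vec(x = x, type = "PV-DM", dim = 15, iter = 5)
model <- paragraph2vec(x = x, type = "PV-DBOW", dim = 100, iter = 20)
## Normalized embeddings (the default)
embedding <- as.matrix(model, which = "docs")
embedding <- as.matrix(model, which = "words")
## Embeddings without normalization
embedding <- as.matrix(model, which = "docs", normalize = FALSE)
embedding <- as.matrix(model, which = "words", normalize = FALSE)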
[Package doc2vec version 0.2.0 Index]