embedding_glove {textdata} | R Documentation |
Global Vectors for Word Representation
Description
The GloVe pre-trained word vectors provide word embeddings trained on corpora of different sizes: 6 billion, 27 billion, 42 billion, or 840 billion tokens.
Usage
embedding_glove6b(
dir = NULL,
dimensions = c(50, 100, 200, 300),
delete = FALSE,
return_path = FALSE,
clean = FALSE,
manual_download = FALSE
)
embedding_glove27b(
dir = NULL,
dimensions = c(25, 50, 100, 200),
delete = FALSE,
return_path = FALSE,
clean = FALSE,
manual_download = FALSE
)
embedding_glove42b(
dir = NULL,
delete = FALSE,
return_path = FALSE,
clean = FALSE,
manual_download = FALSE
)
embedding_glove840b(
dir = NULL,
delete = FALSE,
return_path = FALSE,
clean = FALSE,
manual_download = FALSE
)
Arguments
dir |
Character, path to directory where data will be stored. If NULL, user_cache_dir will be used to determine path. |
dimensions |
A number indicating the number of dimensions of each word vector. One of 50, 100, 200, or 300 for glove6b, or one of 25, 50, 100, or 200 for glove27b. |
delete |
Logical, set TRUE to delete dataset. |
return_path |
Logical, set TRUE to return the path of the dataset. |
clean |
Logical, set TRUE to remove intermediate files. This can greatly reduce the size. Defaults to FALSE. |
manual_download |
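Logical, set TRUE if you have manually downloaded the file and placed it in the folder designated by running this function with return_path = TRUE. |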
Details
Citation info:
@InProceedings{pennington2014glove,
author = {Jeffrey Pennington and Richard Socher and Christopher D. Manning},
title = {GloVe: Global Vectors for Word Representation},
booktitle = {Empirical Methods in Natural Language Processing (EMNLP)},
year = 2014,
pages = {1532-1543},
url = {http://www.aclweb.org/anthology/D14-1162}
}
Value
A tibble with 400k (glove6b), 1.2m (glove27b), 1.9m (glove42b), or 2.2m (glove840b) rows (one row for each unique token in the vocabulary) and the following variables:
- token
An individual token (usually a word)
- d1, d2, etc
The embeddings for that token.
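As a rough illustration of working with a result of this shape, the sketch below builds a small stand-in data frame with a token column followed by d1, d2, d3, and computes cosine similarity between tokens. The tokens, the numeric values, and the helper cosine_similarity() are invented for illustration and are not part of textdata; with real data, the object returned by embedding_glove6b() would take the place of glove_toy.

```r
# Toy stand-in for the tibble returned by embedding_glove6b(); the tokens and
# numbers below are invented for illustration, not real GloVe values.
glove_toy <- data.frame(
  token = c("king", "queen", "apple"),
  d1 = c(0.50, 0.45, -0.30),
  d2 = c(0.10, 0.20, 0.80),
  d3 = c(-0.40, -0.35, 0.25),
  stringsAsFactors = FALSE
)

# Move the embedding columns into a numeric matrix with tokens as row names.
embedding_matrix <- as.matrix(glove_toy[, -1])
rownames(embedding_matrix) <- glove_toy$token

# Hypothetical helper: cosine similarity between the rows named a and b.
cosine_similarity <- function(m, a, b) {
  sum(m[a, ] * m[b, ]) / (sqrt(sum(m[a, ]^2)) * sqrt(sum(m[b, ]^2)))
}

sim_king_queen <- cosine_similarity(embedding_matrix, "king", "queen")
sim_king_apple <- cosine_similarity(embedding_matrix, "king", "apple")
```

Converting to a matrix once keeps row lookups by token simple. For the full vocabularies the matrix can be large, so filtering to the tokens of interest before converting may be preferable.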
Source
https://nlp.stanford.edu/projects/glove/
References
Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation.
Examples
## Not run:
embedding_glove6b(dimensions = 50)
# Custom directory
embedding_glove42b(dir = "data/")
# Deleting dataset
embedding_glove6b(delete = TRUE, dimensions = 300)
# Returning filepath of data
embedding_glove840b(return_path = TRUE)
## End(Not run)