demodata {PsychWordVec}    R Documentation

Demo data (pre-trained using word2vec on Google News; 8000 vocab, 300 dims).

Description

This demo dataset contains a sample of 8000 English words with 300-dimensional word vectors pre-trained with the "word2vec" algorithm on the Google News corpus. Most of the words come from a list of the 8000 most frequent words; a few less frequent words were selected and appended.

Usage

data(demodata)

Format

A data.table (of new class wordvec) with two variables, word and vec. It was transformed from the raw data (see the URL in Source) into .RData using the data_transform function.

Source

Google Code - word2vec (https://code.google.com/archive/p/word2vec/)

Examples

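# Inspect the demo data (a data.table of class wordvec; see Format)
class(demodata)
demodata

# Convert to an embedding object, normalizing the word vectors
embed = as_embed(demodata, normalize=TRUE)
class(embed)
embed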
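
The normalize=TRUE argument in the example above is easiest to understand on a small case. The following is a minimal base-R sketch, not the package's implementation. It assumes that normalizing means scaling each word vector to unit Euclidean length, and it uses a hypothetical 3-word, 3-dimension matrix (toy) in place of the 8000 x 300 demo data. Under that assumption, the dot product of two normalized vectors equals their cosine similarity.

```r
# Toy stand-in for an embedding matrix: one row per word, one column per
# dimension. The words and values are hypothetical, chosen for illustration.
toy <- rbind(
  king  = c(0.9, 0.1, 0.3),
  queen = c(0.8, 0.2, 0.4),
  apple = c(0.1, 0.9, 0.2)
)

# Scale each row to unit Euclidean length. This is the ASSUMED effect of
# normalize=TRUE. Dividing a matrix by a vector of length nrow(toy)
# recycles the vector down the columns, so row i is divided by its own norm.
unit <- toy / sqrt(rowSums(toy^2))

# For unit-length rows, the matrix of pairwise dot products is the
# cosine similarity matrix.
cos_sim <- unit %*% t(unit)

print(rowSums(unit^2))     # squared row norms after scaling
print(round(cos_sim, 3))   # pairwise cosine similarities
```

If the normalized embed object stores its vectors as matrix rows, the same dot-product shortcut would apply to it, but that layout is an assumption here and should be checked with class(embed) and dim(embed).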


[Package PsychWordVec version 2023.9 Index]