get_wordvec {PsychWordVec}    R Documentation

Extract word vector(s).

Description

Extract word vector(s), using either a list of words or a regular expression.

Usage

get_wordvec(
  data,
  words = NULL,
  pattern = NULL,
  plot = FALSE,
  plot.dims = NULL,
  plot.step = 0.05,
  plot.border = "white"
)

Arguments

data

A wordvec (data.table) or embed (matrix); see data_wordvec_load.

words

[Option 1] Character vector of one or more words to extract.

pattern

[Option 2] Regular expression (see str_subset). If neither words nor pattern is specified (i.e., both are NULL), all words in the data are extracted.

plot

Generate a plot to illustrate the word vectors? Defaults to FALSE.

plot.dims

Dimensions to be plotted (e.g., 1:100). Defaults to NULL (plot all dimensions).

plot.step

Step for value breaks. Defaults to 0.05.

plot.border

Color of tile border. Defaults to "white". To remove the border color, set plot.border=NA.
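
To illustrate how the two selection options differ, the sketch below mimics them on a toy vocabulary in base R. The `vocab` vector and the use of `%in%` and `grep()` are illustrative assumptions, not the package's internals; they only approximate the documented behavior, in which `pattern` follows `str_subset` and so matches anywhere within a word.

```r
# Toy vocabulary (hypothetical; stands in for the words stored in `data`)
vocab <- c("China", "Chinese", "Japan", "Japanese", "Korea", "king")

# Option 1: exact word list (only identical strings are selected)
words <- c("China", "Japan", "Korea")
by_words <- vocab[vocab %in% words]

# Option 2: regular expression (unanchored, so substrings also match)
pattern <- "Chin[ae]|Japan|Korea"
by_pattern <- grep(pattern, vocab, value = TRUE)

# Anchoring the pattern restricts it to whole-word matches
by_anchored <- grep("^(China|Japan|Korea)$", vocab, value = TRUE)

print(by_words)
print(by_pattern)
print(by_anchored)
```

Under these assumptions, the unanchored pattern also picks up "Chinese" and "Japanese", while the word list and the anchored pattern select only the three country names.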

Value

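A data.table with words as columns and dimensions as rows. When plot=TRUE, the plot object is attached to the result as the "ggplot" attribute (see Examples).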
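
As a rough illustration of this layout, the sketch below builds a small stand-in table and computes cosine similarity between two word columns. The values are made up, a plain data.frame is used in place of a data.table, and the `cosine()` helper is hypothetical (not a package function).

```r
# Hypothetical 4-dimensional vectors; real embeddings have many more dimensions
dt <- data.frame(
  China = c(0.5, 0.5, 0.5, 0.5),
  Japan = c(0.5, 0.5, 0.5, -0.5),
  Korea = c(0.5, -0.5, 0.5, -0.5)
)

# One row per dimension, one column per word
dim(dt)

# Cosine similarity between two word columns (illustrative helper)
cosine <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

cos_cj <- cosine(dt$China, dt$Japan)
cos_ck <- cosine(dt$China, dt$Korea)
print(c(China_Japan = cos_cj, China_Korea = cos_ck))
```

Because each word is a column, a single word's vector can be pulled out with `$` or `[[`, which should work the same way on a data.table.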

Download

Download pre-trained word vectors data (.RData): https://psychbruce.github.io/WordVector_RData.pdf

See Also

data_wordvec_subset

plot_wordvec

plot_wordvec_tSNE

Examples

d = as_embed(demodata, normalize=TRUE)

get_wordvec(d, c("China", "Japan", "Korea"))
get_wordvec(d, cc(" China, Japan; Korea "))

## specify `pattern`:
get_wordvec(d, pattern="Chin[ae]|Japan|Korea")

## plot word vectors:
get_wordvec(d, cc("China, Japan, Korea,
                   Mac, Linux, Windows"),
            plot=TRUE, plot.dims=1:100)

## a more complex example:

words = cc("
China
Chinese
Japan
Japanese
good
bad
great
terrible
morning
evening
king
queen
man
woman
he
she
cat
dog
")

dt = get_wordvec(
  d, words,
  plot=TRUE,
  plot.dims=1:100,
  plot.step=0.06)

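# to modify the plot, retrieve it from the "ggplot" attribute
# (attach ggplot2 first if its functions are not already available):
library(ggplot2)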
attr(dt, "ggplot") +
  scale_fill_viridis_b(n.breaks=10, show.limits=TRUE) +
  theme(legend.key.height=unit(0.1, "npc"))

# or to save the plot:
ggsave(attr(dt, "ggplot"),
       filename="wordvecs.png",
       width=8, height=5, dpi=500)
unlink("wordvecs.png")  # delete file for code check


[Package PsychWordVec version 2023.9 Index]