dem_group {conText} | R Documentation
Average document-embeddings in a dem by a grouping variable
Description
Averages the embeddings in a dem by a grouping variable: the rows (documents) that belong to
each group are averaged dimension by dimension, creating new "documents" labeled with the group names.
Similar in essence to dfm_group.
Usage
dem_group(x, groups = NULL)
Arguments
x | a (dem-class) document-embedding-matrix
groups | a character or factor variable equal in length to the number of documents
Value
a G x D (dem-class) document-embedding-matrix corresponding to the ALC embeddings for each group.
G = number of unique groups defined in the groups variable, D = dimensions of pretrained embeddings.
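To make the G x D shape concrete, here is a minimal base-R sketch of group-wise averaging on a toy matrix. It is only an illustration under stated assumptions, not conText's implementation: `emb`, `grp`, `group_sums`, `group_sizes` and `group_means` are hypothetical names, and a plain numeric matrix stands in for a dem.

```r
# Toy stand-in for a dem: 4 documents (rows) x 3 embedding dimensions (columns).
# All names here are hypothetical and used for illustration only.
emb <- matrix(c(1, 2, 3,
                3, 4, 5,
                0, 0, 0,
                2, 2, 2),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("doc", 1:4), paste0("dim", 1:3)))

# One group label per document, as required for the groups argument
grp <- c("D", "D", "R", "R")

# rowsum() adds up the rows that share a group label (base R)
group_sums <- rowsum(emb, group = grp)

# number of documents per group, aligned with the row order of group_sums
group_sizes <- as.vector(table(grp)[rownames(group_sums)])

# dividing each row by its group size turns the sums into averages
group_means <- group_sums / group_sizes

dim(group_means)  # G = 2 groups, D = 3 dimensions
group_means
```

Each row of `group_means` is the average embedding of the documents in one group, which is the role the rows of the returned dem play.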
Examples
library(conText)
library(quanteda)
# tokenize corpus
toks <- tokens(cr_sample_corpus)
# build a tokenized corpus of contexts surrounding a target term
immig_toks <- tokens_context(x = toks, pattern = "immigr*", window = 6L)
# build document-feature matrix
immig_dfm <- dfm(immig_toks)
# construct document-embedding-matrix
immig_dem <- dem(immig_dfm, pre_trained = cr_glove_subset,
                 transform = TRUE, transform_matrix = cr_transform, verbose = FALSE)
# to get group-specific embeddings, average within party
immig_wv_party <- dem_group(immig_dem,
                            groups = immig_dem@docvars$party)
[Package conText version 1.4.3 Index]