R: Find a specified matrix transformation

find_transformation {text2map}

R Documentation

Find a specified matrix transformation

Description

Given a matrix, B, of word embedding vectors (the source) with terms as rows, this function returns a transformed matrix following a specified operation. The operations include centering (i.e., translation) and normalization (i.e., scaling). In the first, B is centered by subtracting column means. In the second, B is normalized by the L2 norm. Both have been found to improve word embedding representations. The function can also find a transformed matrix that approximately aligns B with another matrix, A, of word embedding vectors (the reference), using a Procrustes transformation (see Details). Finally, given a term-co-occurrence matrix built on a local corpus, the function can "retrofit" pretrained embeddings to better match the local corpus.
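The centering and normalization operations can be illustrated in base R. This is a minimal sketch, not the package's own implementation: the toy matrix and its terms are made up, and it assumes "normalized by the L2 norm" means scaling each row (term vector) to unit length. Scaling by the norm of the whole matrix is another possible reading.

```r
# Toy embedding matrix: 4 terms (rows) x 3 dimensions (columns).
# The terms and values are hypothetical, for illustration only.
set.seed(1)
B <- matrix(rnorm(12), nrow = 4,
            dimnames = list(c("cat", "dog", "car", "bus"), NULL))

# Centering (translation): subtract each column's mean.
B_centered <- sweep(B, 2, colMeans(B), "-")

# Normalization (scaling): divide each row by its L2 norm.
# Assumption: row-wise scaling, so every term vector has unit length.
B_normed <- B / sqrt(rowSums(B^2))

colMeans(B_centered)      # numerically zero in every column
sqrt(rowSums(B_normed^2)) # numerically one for every term
```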

Usage

find_transformation(
  wv,
  ref = NULL,
  method = c("align", "norm", "center", "retrofit")
)

Arguments

`wv`	Matrix of word embedding vectors (a.k.a. embedding model) with rows as terms (the source matrix to be transformed).
`ref`	If `method = "align"`, this is the reference matrix toward which the source matrix is to be aligned.
`method`	Character vector indicating the method to use for the transformation. Current methods include: "align", "norm", "center", and "retrofit"; see Details.

Details

Aligning a source matrix of word embedding vectors, B, to a reference matrix, A, has primarily been used as a post-processing step for embeddings trained on longitudinal corpora for diachronic analysis, or for cross-lingual embeddings. Aligning preserves internal (cosine) distances while orienting the source embeddings to minimize the sum of squared distances to the reference (and is therefore a least squares problem). Alignment is accomplished with the following steps:

translation: centers by column means
scaling: scales (normalizes) by the L2 norm
rotation/reflection: rotates and reflects to minimize the sum of squared differences, using singular value decomposition
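
The three steps above can be sketched in base R as an orthogonal Procrustes problem. This is an illustration under stated assumptions, not the package's code: the helper names `center` and `normalize` are hypothetical, scaling is assumed to be row-wise, and the source matrix is constructed as a rotated copy of the reference so that a near-exact alignment exists.

```r
set.seed(42)
A <- matrix(rnorm(15), nrow = 5)        # reference: 5 terms x 3 dimensions
Q <- qr.Q(qr(matrix(rnorm(9), 3)))      # a random orthogonal matrix
B <- A %*% Q                            # source: the reference, rotated/reflected

# Hypothetical helpers for the first two steps.
center    <- function(X) sweep(X, 2, colMeans(X), "-")  # translation
normalize <- function(X) X / sqrt(rowSums(X^2))         # scaling (row-wise, assumed)

A_cn <- normalize(center(A))
B_cn <- normalize(center(B))

# Rotation/reflection: find the orthogonal R minimizing the sum of squared
# differences between B_cn %*% R and A_cn. With t(B_cn) %*% A_cn = U D V',
# the minimizer is R = U V'.
s <- svd(crossprod(B_cn, A_cn))
R <- s$u %*% t(s$v)
B_aligned <- B_cn %*% R

max(abs(B_aligned - A_cn))                        # close to zero for this toy case
max(abs(tcrossprod(B_aligned) - tcrossprod(B_cn))) # inner products are unchanged
```

Because R is orthogonal, the inner products between term vectors (and therefore their cosine similarities) are the same before and after the rotation step.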

Alignment is asymmetrical: it outputs only the transformed source matrix, B. It is therefore typically recommended to align B to A, and then A to B. However, simply centering and norming A afterward may be sufficient.
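
The two options described above might look like the following sketch. It uses only the arguments shown under Usage; the toy matrices and terms are made up, both matrices are assumed to share the same terms and dimensions, and the calls run only if text2map is installed. Whether the second alignment should target the original or the already-aligned B is not specified here, so the sketch uses the original.

```r
set.seed(7)
terms <- c("cat", "dog", "car", "bus", "tree")   # hypothetical vocabulary
A <- matrix(rnorm(20), nrow = 5, dimnames = list(terms, NULL))  # reference
B <- matrix(rnorm(20), nrow = 5, dimnames = list(terms, NULL))  # source

if (requireNamespace("text2map", quietly = TRUE)) {
  # Align the source B toward the reference A.
  B_to_A <- text2map::find_transformation(B, ref = A, method = "align")

  # Option 1: also align A toward B (assumption: the original B).
  A_to_B <- text2map::find_transformation(A, ref = B, method = "align")

  # Option 2: only center and then norm A.
  A_cn <- text2map::find_transformation(A, method = "center")
  A_cn <- text2map::find_transformation(A_cn, method = "norm")
}
```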

Value

A new word embedding matrix, transformed using the specified method.

References

Mikel Artetxe, Gorka Labaka, and Eneko Agirre. (2018). 'A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings.' In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 789-798.
Mikel Artetxe, Gorka Labaka, and Eneko Agirre. (2019). 'An effective approach to unsupervised machine translation.' In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 194-203.
Hamilton, William L., Jure Leskovec, and Dan Jurafsky. (2018). 'Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change.' https://arxiv.org/abs/1605.09096v6.
Lin, Zefeng, Xiaojun Wan, and Zongming Guo. (2019). 'Learning Diachronic Word Embeddings with Iterative Stable Information Alignment.' Natural Language Processing and Chinese Computing. 749-60. doi:10.1007/978-3-030-32233-5_58.
Schlechtweg et al. (2019). 'A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains.' https://arxiv.org/abs/1906.02979v1.
Shoemark et al. (2019). 'Room to Glo: A Systematic Comparison of Semantic Change Detection Approaches with Word Embeddings.' Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. 66-76. doi:10.18653/v1/D19-1007.
Borg and Groenen. (1997). Modern Multidimensional Scaling. New York: Springer. 340-342.

[Package text2map version 0.2.0 Index]