find_transformation {text2map} | R Documentation |
Find a specified matrix transformation
Description
Given a matrix, B, of word embedding vectors (source) with terms as
rows, this function finds a transformed matrix following a specified
operation. The operations include centering (i.e., translation) and
normalization (i.e., scaling). In the first, B is centered by
subtracting column means. In the second, B is normalized by the L2
norm. Both have been found to improve word embedding representations.
The function also finds a transformed matrix that approximately aligns
B with another matrix, A, of word embedding vectors (reference), using
Procrustes transformation (see details). Finally, given a
term-co-occurrence matrix built on a local corpus, the function can
"retrofit" pretrained embeddings to better match the local corpus.
Usage
find_transformation(
wv,
ref = NULL,
method = c("align", "norm", "center", "retrofit")
)
Arguments
wv |
Matrix of word embedding vectors (a.k.a. an embedding model) with rows as terms (the source matrix to be transformed). |
ref |
If |
method |
Character vector indicating the method to use for the transformation. Current methods include: "align", "norm", "center", and "retrofit" – see details. |
Details
Aligning a source matrix of word embedding vectors, B, to a reference
matrix, A, has primarily been used as a post-processing step for
embeddings trained on longitudinal corpora for diachronic analysis or
for cross-lingual embeddings. Aligning preserves internal (cosine)
distances while orienting the source embeddings to minimize the sum of
squared distances (and is therefore a least squares problem).
Alignment is accomplished with the following steps:
translation: centers by column means
scaling: scales (normalizes) by the L2 norm
rotation/reflection: rotates and reflects to minimize the sum of squared differences, using singular value decomposition
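The three steps above can be sketched in base R as an orthogonal Procrustes problem. This is a minimal illustration under stated assumptions, not the package's implementation: `center_scale()` and `procrustes_align()` are hypothetical helper names, the data are random, and scaling the whole matrix by its Frobenius norm is one interpretation of the scaling step.

```r
set.seed(1)

# Hypothetical reference embeddings A: 5 terms x 3 dimensions
A <- matrix(rnorm(5 * 3), nrow = 5,
            dimnames = list(paste0("w", 1:5), NULL))

# Build a source matrix B by rotating A with a random orthogonal matrix
Q_true <- qr.Q(qr(matrix(rnorm(9), 3, 3)))
B <- A %*% Q_true

# Steps 1 and 2: center by column means, then scale.
# ASSUMPTION: scaling by the Frobenius (matrix L2) norm.
center_scale <- function(m) {
  m <- sweep(m, 2, colMeans(m), "-")
  m / sqrt(sum(m^2))
}

# Step 3: find the orthogonal Q minimizing ||B0 %*% Q - A0||^2.
# With svd(t(B0) %*% A0) = U D V', the solution is Q = U V'.
procrustes_align <- function(B, A) {
  A0 <- center_scale(A)
  B0 <- center_scale(B)
  s <- svd(crossprod(B0, A0))
  Q <- s$u %*% t(s$v)
  B0 %*% Q
}

B_aligned <- procrustes_align(B, A)
```

Because `B` here is an exact rotation of `A`, the aligned result matches the centered and scaled `A` up to floating-point error. With real embeddings the fit is only approximate.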
Alignment is asymmetrical, and only outputs the transformed source
matrix, B. Therefore, it is typically recommended to align B to A, and
then A to B. However, simply centering and norming A afterward may be
sufficient.
Value
A new word embedding matrix, transformed using the specified method.
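A short usage sketch, based only on the signature shown under Usage. The toy matrix `wv` and its row names are hypothetical, and the calls are guarded so the snippet still runs when the package is not installed.

```r
# Hypothetical toy embeddings: 5 terms x 4 dimensions of random values
set.seed(42)
wv <- matrix(
  rnorm(5 * 4), nrow = 5,
  dimnames = list(c("alpha", "beta", "gamma", "delta", "epsilon"), NULL)
)

# Only call the package if it is installed
if (requireNamespace("text2map", quietly = TRUE)) {
  # Center by column means, then (separately) normalize by the L2 norm
  wv_centered <- text2map::find_transformation(wv, method = "center")
  wv_normed <- text2map::find_transformation(wv, method = "norm")

  # The result should be a matrix with the same shape as the input
  print(dim(wv_centered))
}
```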
References
Mikel Artetxe, Gorka Labaka, and Eneko Agirre. (2018).
'A robust self-learning method for fully unsupervised
cross-lingual mappings of word embeddings.' In Proceedings
of the 56th Annual Meeting of the Association for
Computational Linguistics. 789-798
Mikel Artetxe, Gorka Labaka, and Eneko Agirre. (2019).
'An effective approach to unsupervised machine translation.'
In Proceedings of the 57th Annual Meeting of the Association
for Computational Linguistics. 194-203
Hamilton, William L., Jure Leskovec, and Dan Jurafsky. (2018).
'Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change.'
https://arxiv.org/abs/1605.09096v6.
Lin, Zefeng, Xiaojun Wan, and Zongming Guo. (2019).
'Learning Diachronic Word Embeddings with Iterative Stable
Information Alignment.' Natural Language Processing and
Chinese Computing. 749-60. doi:10.1007/978-3-030-32233-5_58.
Schlechtweg et al. (2019). 'A Wind of Change: Detecting and
Evaluating Lexical Semantic Change across Times and Domains.'
https://arxiv.org/abs/1906.02979v1.
Shoemark et al. (2019). 'Room to Glo: A Systematic Comparison
of Semantic Change Detection Approaches with Word Embeddings.'
Proceedings of the 2019 Conference on Empirical Methods in
Natural Language Processing. 66-76. doi:10.18653/v1/D19-1007
Borg and Groenen. (1997). Modern Multidimensional Scaling.
New York: Springer. 340-342