R: Compute word (or compound) plausibility

plausibility {LSAfun}

R Documentation

Compute word (or compound) plausibility

Description

Computes measures of semantic transparency (plausibility) for words or compounds.

Usage

plausibility(x, method, n = 10, stem, tvectors = tvectors)

Arguments

`x`	a character vector of `length(x) = 1`, or a numeric vector of `length(x) = ncol(tvectors)` (that is, with the same dimensionality as the LSA space)
`method`	the measure of semantic transparency; one of `n_density`, `length`, `proximity`, or `entropy` (see Details)
`n`	the number of neighbors for the `n_density` method
`stem`	the stem (or word) of comparison for the `proximity` method
`tvectors`	the semantic space in which the computation is to be done (a numeric matrix where every row is a word vector)

Details

If a phrase of more than one word is used as input, x should have the form x <- "word1 word2 word3" rather than x <- c("word1", "word2", "word3"). The phrase vector is then computed by simple vector addition of the constituent vectors.
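To make the additive phrase vector concrete, here is a minimal base R sketch. It is not the package's internal code: `phrase_vector` is a hypothetical helper, and the toy matrix holds made-up values standing in for a real semantic space.

```r
# Toy semantic space: every row is a word vector (values are made up)
tvectors <- matrix(
  c(1, 0, 2,
    0, 3, 1,
    2, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("mad", "hatter", "tea"), NULL)
)

# Hypothetical helper (not part of LSAfun): split a phrase on spaces
# and add the constituent word vectors
phrase_vector <- function(x, tvectors) {
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  colSums(tvectors[words, , drop = FALSE])
}

v <- phrase_vector("mad hatter", tvectors)
print(v)
```

The result has `ncol(tvectors)` components, which is the same shape a numeric `x` is expected to have.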

Since x can also be any numeric vector with the dimensionality of the active LSA space, this function can be combined with compose() to compute semantic transparency measures of complex expressions (see Examples). The semantic transparency methods were developed as measures for composed vectors, so applying them makes the most sense for such vectors.
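As a rough illustration of what a composed input looks like, the base R sketch below builds a vector by elementwise multiplication of two word vectors. This assumes that multiplicative composition means the elementwise product; the toy matrix and its values are made up, and the code does not call compose() itself.

```r
# Toy semantic space: every row is a word vector (values are made up)
tvectors <- matrix(
  c(1, 0, 2,
    0, 3, 1,
    2, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("mad", "hatter", "tea"), NULL)
)

# Assumed form of multiplicative composition: elementwise product
composed <- tvectors["mad", ] * tvectors["hatter", ]

# A numeric x must have one component per column of the space
stopifnot(is.numeric(composed), length(composed) == ncol(tvectors))
print(composed)
```

A vector of this shape is what the numeric form of `x` refers to.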

The methods are defined as follows:

method = "n_density": The average cosine between a (word or phrase) vector and its n nearest neighbors, excluding the word itself when a single word is submitted (see also SND for a more detailed version)
method = "length": The length of a vector (as computed by the standard Euclidean norm)
method = "proximity": The cosine similarity between a compound vector and its stem word (for example between mad hatter and hatter, or between objectify and object)
method = "entropy": The entropy of the K-dimensional vector with the vector components t_1,...,t_K, as computed by

entropy = \log K - \sum_{i=1}^{K} t_i \log t_i
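The four definitions above can be sketched in base R as follows. This is an illustrative reading of the definitions, not the package's implementation: all function names are hypothetical, the toy values are made up, and the components are chosen to be strictly positive so that the logarithm in the entropy formula is defined.

```r
# Toy space with strictly positive components (made-up values)
tvectors <- matrix(
  c(1, 2, 2,
    2, 1, 2,
    2, 2, 1,
    1, 1, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("mad", "hatter", "tea", "cat"), NULL)
)

# Cosine similarity between two numeric vectors
cosine_sim <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# "length": Euclidean norm of the vector
vec_length <- function(v) sqrt(sum(v^2))

# "proximity": cosine between a vector and its stem word
proximity <- function(v, stem, tvectors) cosine_sim(v, tvectors[stem, ])

# "n_density": mean cosine to the n nearest neighbors;
# `exclude` drops the word itself when a single word is submitted
n_density <- function(v, n, tvectors, exclude = NULL) {
  candidates <- setdiff(rownames(tvectors), exclude)
  sims <- apply(tvectors[candidates, , drop = FALSE], 1, cosine_sim, b = v)
  mean(sort(sims, decreasing = TRUE)[seq_len(min(n, length(sims)))])
}

# "entropy": log(K) - sum(t_i * log(t_i)), assuming all t_i > 0
vec_entropy <- function(v) log(length(v)) - sum(v * log(v))

# Additive phrase vector for "mad hatter"
v <- tvectors["mad", ] + tvectors["hatter", ]

print(vec_length(v))
print(proximity(v, "hatter", tvectors))
print(n_density(v, n = 2, tvectors))
print(n_density(tvectors["mad", ], n = 1, tvectors, exclude = "mad"))
print(vec_entropy(v))
```

Real LSA vectors can contain zero or negative components, for which `log()` is undefined or zero-weighted; how such values are treated is not specified here, so the entropy line should be read only as a direct transcription of the formula.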

Value

The semantic transparency measure, as a numeric value

Author(s)

Fritz Guenther

References

Lazaridou, A., Vecchi, E., & Baroni, M. (2013). Fish transporters and miracle homes: How compositional distributional semantics can help NP parsing. In Proceedings of EMNLP 2013 (pp. 1908-1913). Seattle, WA.

Marelli, M., & Baroni, M. (2015). Affixation in semantic space: Modeling morpheme meanings with compositional distributional semantics. Psychological Review, 122, 485-515.

Vecchi, E. M., Baroni, M., & Zamparelli, R. (2011). (Linear) maps of the impossible: Capturing semantic anomalies in distributional space. In Proceedings of the ACL Workshop on Distributional Semantics and Compositionality (pp. 1-9). Portland, OR.

Examples

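library(LSAfun)
data(wonderland)

# Neighborhood density of the additive phrase vector, using its 10 nearest neighbors
plausibility("cheshire cat", method = "n_density", n = 10, tvectors = wonderland)

# Proximity between a multiplicatively composed vector and its stem word
plausibility(compose("mad", "hatter", method = "Multiply", tvectors = wonderland),
             method = "proximity", stem = "hatter", tvectors = wonderland)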

[Package LSAfun version 0.7.1 Index]