MultipleChoice {LSAfun} | R Documentation |
Answers Multiple Choice Questions
Description
Selects the option nearest to an input out of a set of answer options.
Usage
MultipleChoice(x, y, tvectors = tvectors, remove.punctuation = TRUE,
               stopwords = NULL, method = "Add", all.results = FALSE)
Arguments
x |
a character vector of |
y |
a character vector specifying multiple answer options (with each element of the vector being one answer option) |
tvectors |
the semantic space in which the computation is to be done (a numeric matrix where every row is a word vector) |
remove.punctuation |
removes punctuation from |
stopwords |
a character vector defining a list of words that are not used to compute the document/sentence vector for |
method |
the compositional model to compute the document vector from its word vectors. The default option |
all.results |
If |
Details
Computes all the cosines between a given sentence/document or word and multiple answer options. Then selects the nearest option to the input (the option with the highest cosine). This function relies entirely on the costring function.
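The selection step can be sketched in base R without the package. This is only a minimal illustration, assuming additive composition (as in method="Add") and a made-up two-dimensional space; the names toy_space, doc_vector and cos_sim are hypothetical and not part of LSAfun, and the actual computation inside costring may differ in its details.

```r
# Toy semantic space: one row per word, rownames are the words (made-up values).
toy_space <- rbind(
  hare   = c(1.0, 0.1),
  hatter = c(0.9, 0.2),
  queen  = c(0.1, 1.0),
  red    = c(0.2, 0.8),
  mad    = c(0.8, 0.3)
)

# Additive composition (assumption): sum the vectors of all words found in the space.
doc_vector <- function(text, space) {
  words <- strsplit(tolower(text), "\\s+")[[1]]
  words <- words[words %in% rownames(space)]
  colSums(space[words, , drop = FALSE])
}

# Cosine similarity between two numeric vectors.
cos_sim <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

x <- "hare"
y <- c("mad hatter", "red queen")

# One cosine per answer option, named by the option and sorted by decreasing similarity.
cosines <- sapply(y, function(opt) {
  cos_sim(doc_vector(x, toy_space), doc_vector(opt, toy_space))
})
cosines <- sort(cosines, decreasing = TRUE)

# The selected answer is the option with the highest cosine.
best <- names(cosines)[1]
```

Here cosines plays the role of the named vector returned with all.results=TRUE, and best the role of the single answer returned by default.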
A note will be displayed whenever not all words of one answer alternative are found in the semantic space. Caution: In that case, the function will still produce a result, by omitting the words not found in the semantic space. Depending on the specific requirements of a task, this may compromise the results. Please check your input when you receive this message.
A warning message will be displayed whenever no word of one answer alternative is found in the semantic space.
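Before relying on a result, it can help to check how many words of each answer option are actually covered by the semantic space. The following base R sketch assumes that the words of the space are stored as rownames(tvectors) and uses simple whitespace splitting; toy_words and coverage are hypothetical names, and the preprocessing done by the package itself may differ.

```r
# Stand-in for rownames(tvectors): the words available in the space (made up).
toy_words <- c("mad", "hatter", "queen")

y <- c("mad hatter", "red queen", "cheshire cat")

# Share of words per answer option that are found in the space.
coverage <- sapply(y, function(opt) {
  words <- strsplit(tolower(opt), "\\s+")[[1]]
  mean(words %in% toy_words)
})

# Options that would trigger the note (some words missing) ...
partly_missing <- names(coverage)[coverage > 0 & coverage < 1]
# ... and options that would trigger the warning (no word found).
fully_missing <- names(coverage)[coverage == 0]
```

With these toy inputs, "red queen" is only partly covered and "cheshire cat" is not covered at all.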
Using method="Analogy" requires the input in both x and y to only consist of word pairs (for example x = c("helmet head") and y = c("kneecap knee", "atmosphere earth", "grass field")). In that case, the function will try to identify the best-fitting answer in y by applying the king - man + woman = queen rationale to solve man : king = woman : ? (Mikolov et al., 2013): if this relation holds, one should also have king - man = queen - woman. With method="Analogy", the function will compute the difference between the normalized vectors head - helmet, and search for the nearest of the vector differences knee - kneecap, earth - atmosphere, and field - grass.
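The analogy rationale can be illustrated with a small base R sketch. It assumes a made-up three-dimensional space, uses only two of the answer options, and compares the difference vectors by cosine; normalize, cos_sim and pair_diff are hypothetical helper names, and this is not the package's own implementation.

```r
# Scale a vector to unit length.
normalize <- function(v) v / sqrt(sum(v^2))

# Cosine similarity between two numeric vectors.
cos_sim <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# Toy semantic space with made-up values; rownames are the words.
toy_space <- rbind(
  helmet     = c(1.0, 0.0, 0.2),
  head       = c(1.0, 1.0, 0.2),
  kneecap    = c(0.2, 0.1, 1.0),
  knee       = c(0.2, 0.9, 1.0),
  atmosphere = c(0.5, 0.9, 0.1),
  earth      = c(0.9, 0.1, 0.5)
)

# For a pair "a b", return normalized(b) - normalized(a), e.g. head - helmet.
pair_diff <- function(pair, space) {
  w <- strsplit(pair, " ")[[1]]
  normalize(space[w[2], ]) - normalize(space[w[1], ])
}

x <- "helmet head"
y <- c("kneecap knee", "atmosphere earth")

# Compare the difference vector of x with the difference vector of each option.
target  <- pair_diff(x, toy_space)
cosines <- sapply(y, function(p) cos_sim(target, pair_diff(p, toy_space)))
cosines <- sort(cosines, decreasing = TRUE)

best <- names(cosines)[1]
```

In this toy space the difference knee - kneecap points in nearly the same direction as head - helmet, so "kneecap knee" comes out as the nearest option.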
Value
If all.results=FALSE (default), the function will only return the best answer as a character string. If all.results=TRUE, it will return a named numeric vector, where the names are the different answer options in y and the numeric values their respective cosine similarity to x, sorted by decreasing similarity.
Author(s)
Fritz Guenther
References
Landauer, T.K., & Dumais, S.T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104, 211-240.
Mikolov, T., Yih, W. T., & Zweig, G. (2013). Linguistic regularities in continuous space word representations. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT-2013). Association for Computational Linguistics.
See Also
cosine, Cosine, costring, multicostring, analogy
Examples
library(LSAfun)

# Load the example semantic space shipped with the package
data(wonderland)

MultipleChoice("who does the march hare celebrate his unbirthday with?",
               c("mad hatter", "red queen", "caterpillar", "cheshire cat"),
               tvectors = wonderland)