specificities {textometry}    R Documentation
Calculate Lexical Specificity Score
Description
Calculate the specificity (or association, or surprise) score of a word being present f times or more in a sub-corpus of t words, given that it appears a total of F times in a whole corpus of T words.
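As a rough illustration of the quantity described above, the probability of seeing f or more occurrences can be sketched with the hypergeometric distribution, the model discussed in Lafon (1980). The helper name tail_probability and the counts below are hypothetical, and this sketch is not necessarily how specificities computes or scales the score it returns.

```r
# Hypothetical helper (not part of textometry): P(X >= f) when X follows a
# hypergeometric distribution, i.e. t tokens drawn without replacement from
# a corpus of T tokens, F of which are occurrences of the word of interest.
tail_probability <- function(f, t, F, T) {
  # phyper(q, ..., lower.tail = FALSE) gives P(X > q), so q = f - 1 yields P(X >= f)
  phyper(f - 1, m = F, n = T - F, k = t, lower.tail = FALSE)
}

# Made-up counts, for illustration only
p <- tail_probability(f = 30, t = 1000, F = 100, T = 10000)
print(p)
```

A small probability means the word is over-represented in the sub-corpus relative to what its overall frequency would suggest.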
Usage
specificities(lexicaltable, types=NULL, parts=NULL)
Arguments
lexicaltable
a complete lexical table, i.e. a numeric matrix where each row represents a word and each column a part of the corpus. Each cell gives the frequency of the given word in the corresponding part of the corpus.
types
list of rows (words) for which the specificity score must be calculated. If NULL, the score is calculated for all rows.
parts
list of columns (parts) for which the specificity score must be calculated. If NULL, the score is calculated for all columns.
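To make the expected shape of lexicaltable concrete, here is a minimal sketch of a toy lexical table built in base R, together with the marginal counts t, F and T used in the Description. The word names, part names and all counts are invented for illustration.

```r
# Toy lexical table with made-up counts: rows are words, columns are parts
lexicaltable <- matrix(
  c(10, 2, 5,
     3, 8, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("peuple", "roi"), c("P1", "P2", "P3"))
)

t_parts <- colSums(lexicaltable)  # t: number of words in each part
F_words <- rowSums(lexicaltable)  # F: total frequency of each word
T_total <- sum(lexicaltable)      # T: number of words in the whole corpus

print(lexicaltable)
```

A cell such as lexicaltable["peuple", "P1"] then plays the role of f for that word and part.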
Value
Returns a matrix of nrow(lexicaltable) * ncol(lexicaltable) (the number of rows and columns may be reduced using types or parts), each cell giving the specificity score.
Author(s)
Matthieu Decorde, Serge Heiden, Sylvain Loiseau, Lise Vaudor
References
Lafon P. (1980) Sur la variabilité de la fréquence des formes dans un corpus, Mots, 1, pp. 127–165. https://www.persee.fr/doc/mots_0243-6450_1980_num_1_1_1008
See Also
specificities.probabilities, specificities.lexicon
Examples
library(textometry)

# Load the example lexical table and score every word in every part
data(robespierre)
spe <- specificities(robespierre)

# Report f, t, F, T and the specificity score for 'peuple' in part 'D4'
string <- paste("The word %s appears f=%d times in a sub-corpus of t=%d words,",
                " given a total frequency of F=%d in the robespierre corpus made",
                " of T=%d words. The corresponding specificity score is %f", sep="")
print(sprintf(string,
              'peuple',
              robespierre['peuple', 'D4'],
              colSums(robespierre)['D4'],
              rowSums(robespierre)['peuple'],
              sum(robespierre),
              spe['peuple', 'D4']))