R: Compute the semantic similarity between two text variables.

textSimilarity {text}

R Documentation

Compute the semantic similarity between two text variables.

Description

Compute the semantic similarity between two text variables.

Usage

textSimilarity(x, y, method = "cosine", center = TRUE, scale = FALSE)

Arguments

`x`	Word embeddings from textEmbed.
`y`	Word embeddings from textEmbed.
`method`	(character) Type of similarity measure to compute. Default is "cosine". Other options are "spearman" and "pearson", as well as the measures from textDistance(), which are computed here as 1 - textDistance: "euclidean", "maximum", "manhattan", "canberra", "binary" and "minkowski".
`center`	(boolean; from base::scale) If center is TRUE then centering is done by subtracting the column means (omitting NAs) of x from their corresponding columns, and if center is FALSE, no centering is done.
`scale`	(boolean; from base::scale) If scale is TRUE, scaling is done by dividing the (centered) columns of x by their standard deviations when center is TRUE, and by their root mean square otherwise. If scale is FALSE, no scaling is done.
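The center and scale arguments are described in terms of base::scale, so their effect can be illustrated on a small matrix. The following is a minimal sketch in base R only; the matrix x is made-up toy data standing in for embeddings (rows as texts, columns as dimensions), and it does not show how textSimilarity applies these options internally.

```r
# Hypothetical toy "embeddings": 3 texts (rows) x 4 dimensions (columns).
x <- matrix(c(1, 2, 3, 4,
              2, 0, 1, 3,
              0, 1, 5, 2), nrow = 3, byrow = TRUE)

# center = TRUE, scale = FALSE: subtract each column's mean.
x_centered <- scale(x, center = TRUE, scale = FALSE)
colMeans(x_centered)            # every column mean is now (numerically) zero

# center = TRUE, scale = TRUE: additionally divide by the column
# standard deviations.
x_standardized <- scale(x, center = TRUE, scale = TRUE)
apply(x_standardized, 2, sd)    # every column now has standard deviation 1

# center = FALSE, scale = TRUE: divide each column by its root mean square,
# which base::scale defines as sqrt(sum(col^2) / (n - 1)).
x_rms <- scale(x, center = FALSE, scale = TRUE)
attr(x_rms, "scaled:scale")     # the divisors that were used
```

The attributes "scaled:center" and "scaled:scale" on the returned matrix record which values were subtracted and divided, which can help when checking what a given combination of options did.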

Value

A vector of semantic similarity scores. With the default method, "cosine", values closer to 1 indicate higher semantic similarity.
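To make the interpretation of cosine scores concrete, here is a rough sketch of a row-wise cosine similarity in base R. The helper cosine_rows and the matrices a and b are hypothetical illustrations, not part of the text package, and the sketch assumes that row i of one input is compared with row i of the other.

```r
# Hypothetical helper: cosine similarity between matching rows of two
# numeric matrices with identical dimensions.
cosine_rows <- function(a, b) {
  stopifnot(all(dim(a) == dim(b)))
  rowSums(a * b) / (sqrt(rowSums(a^2)) * sqrt(rowSums(b^2)))
}

# Made-up 2-dimensional vectors, one pair per row.
a <- matrix(c(1, 0,
              1, 0,
              1, 0), nrow = 3, byrow = TRUE)
b <- matrix(c( 2, 0,    # same direction as the row in a
               0, 3,    # orthogonal to the row in a
              -1, 0),   # opposite direction to the row in a
            nrow = 3, byrow = TRUE)

scores <- cosine_rows(a, b)
scores  # 1 0 -1
```

Rows pointing in the same direction score 1, orthogonal rows score 0, and opposite rows score -1, regardless of vector length.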

Examples

# Compute the semantic similarity between the embeddings from "harmonytext" and "satisfactiontext".
## Not run: 
similarity_scores <- textSimilarity(
  x = word_embeddings_4$texts$harmonytext,
  y = word_embeddings_4$texts$satisfactiontext
)

# Show information about how similarity_scores were constructed.
comment(similarity_scores)

## End(Not run)

[Package text version 1.2.3 Index]
