cosine {lsa}    R Documentation

Cosine Measure (Matrices)

Description

Calculates the cosine measure between two vectors or between all column vectors of a matrix.

Usage

cosine(x, y = NULL)

Arguments

x

A vector or a matrix (e.g., a document-term matrix).

y

Optional: a vector of the same length as x. If ‘NULL’, all column vectors of x are compared with each other.

Details

cosine() calculates a similarity matrix between all column vectors of a matrix x. This matrix might be a document-term matrix, so columns would be expected to be documents and rows to be terms.

When executed on two vectors x and y, cosine() calculates the cosine similarity between them.
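
As an illustration of the definition behind both cases: the cosine of two vectors is their dot product divided by the product of their Euclidean norms. The following is a minimal base-R sketch of that definition, not the package's own implementation; the helper names cosine_vec and cosine_cols are hypothetical and introduced here only for illustration.

```r
# Hypothetical helper (not part of lsa): cosine of the angle between
# two numeric vectors of equal length.
cosine_vec <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Hypothetical helper (not part of lsa): pairwise cosine between all
# column vectors of a numeric matrix.
cosine_cols <- function(m) {
  norms <- sqrt(colSums(m^2))          # Euclidean norm of each column
  crossprod(m) / outer(norms, norms)   # t(m) %*% m scaled by norm products
}

vec1 <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
vec2 <- c(0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0)

cosine_vec(vec1, vec2)            # single similarity value
cosine_cols(cbind(vec1, vec2))    # 2 x 2 similarity matrix
```

In this sketch a column consisting only of zeros has norm 0, so the division yields NaN for every pair involving it.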

Value

Returns an n*n similarity matrix of cosine values, comparing all n column vectors against each other. Executed on two vectors, their cosine similarity value is returned.

Note

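The cosine measure is closely related to the Pearson correlation coefficient, cor(method="pearson"): the Pearson coefficient is the cosine of the two vectors after each has been centered on its mean, so the two measures coincide when both vectors have mean zero. For an investigation of the differences in the context of text mining, see Leydesdorff (2005).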
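
One way to check the relationship numerically is to compare the cosine of the raw vectors, the Pearson correlation, and the cosine of the mean-centered vectors. The sketch below uses base R only; cos_sim is a hypothetical helper defined here for illustration and is not a function of the lsa package.

```r
# Hypothetical helper (not part of lsa): cosine of two numeric vectors.
cos_sim <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

x <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
y <- c(0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0)

raw_cosine      <- cos_sim(x, y)                       # cosine of the raw vectors
pearson         <- cor(x, y, method = "pearson")       # Pearson correlation
centered_cosine <- cos_sim(x - mean(x), y - mean(y))   # cosine after centering

# The centered cosine agrees with Pearson up to floating-point error.
all.equal(pearson, centered_cosine)
```

For non-negative data such as term counts, the raw cosine lies between 0 and 1, whereas the Pearson coefficient can be negative.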

Author(s)

Fridolin Wild f.wild@open.ac.uk

References

Leydesdorff, L. (2005) Similarity Measures, Author Cocitation Analysis, and Information Theory. In: JASIST 56(7), pp. 769-772.

See Also

cor

Examples


## the cosine measure between two vectors

vec1 = c( 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
vec2 = c( 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0 )
cosine(vec1,vec2) 


## the cosine measure for all document vectors of a matrix

vec3 = c( 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0 )
matrix = cbind(vec1,vec2, vec3)
cosine(matrix)



[Package lsa version 0.73.3 Index]