cosine {lsa}R Documentation

Cosine Measure (Matrices)

Description

Calculates the cosine measure between two vectors or between all column vectors of a matrix.

Usage

cosine(x, y = NULL)

Arguments

x

A vector or a matrix (e.g., a document-term matrix).

y

Optional: a vector with compatible dimensions to x. If ‘NULL’, all column vectors of x are correlated.

Details

cosine() calculates a similarity matrix between all column vectors of a matrix x. This matrix might be a document-term matrix, so columns would be expected to be documents and rows to be terms.

When executed on two vectors x and y, cosine() calculates the cosine similarity between them.

Value

Returns a n*n similarity matrix of cosine values, comparing all n column vectors against each other. Executed on two vectors, their cosine similarity value is returned.

Note

The cosine measure is nearly identical with the pearson correlation coefficient (besides a constant factor) cor(method="pearson"). For an investigation on the differences in the context of textmining see (Leydesdorff, 2005).

Author(s)

Fridolin Wild f.wild@open.ac.uk

References

Leydesdorff, L. (2005) Similarity Measures, Author Cocitation Analysis,and Information Theory. In: JASIST 56(7), pp.769-772.

See Also

cor

Examples


## the cosinus measure between two vectors

vec1 = c( 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
vec2 = c( 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0 )
cosine(vec1,vec2) 


## the cosine measure for all document vectors of a matrix

vec3 = c( 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0 )
matrix = cbind(vec1,vec2, vec3)
cosine(matrix)



[Package lsa version 0.73.3 Index]