as.textmatrix {lsa} | R Documentation
Display a latent semantic space generated by Latent Semantic Analysis (LSA)
Description
Returns a latent semantic space (created by createLSAspace) in textmatrix format: rows are terms, columns are documents.
Usage
as.textmatrix( LSAspace )
Arguments
LSAspace
a latent semantic space generated by createLSAspace.
Details
To allow comparisons between terms and documents, the internal format of the latent semantic space needs to be converted to a classical document-term matrix (just like the ones generated by textmatrix(), which are of class ‘textmatrix’).
Remark: Documents and terms can also be compared directly, using the partial matrices of an LSA space without this conversion. See Berry et al. (1995) for more information.
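As a rough illustration of the conversion described above, the following base-R sketch rebuilds a term-by-document matrix from truncated SVD factors. It assumes that the latent semantic space amounts to the three partial matrices of a singular value decomposition. The helper `rebuild` and the hand-typed count matrix `M` are hypothetical names used only here, and the package's own internals may differ.

```r
# Toy term-by-document counts (rows are terms, columns are documents),
# typed by hand from the four small files used in the Examples section.
M <- matrix(
  c(1, 0, 1, 2,   # dog
    1, 0, 0, 0,   # cat
    1, 1, 0, 1,   # mouse
    0, 1, 0, 0,   # hamster
    0, 1, 0, 0,   # sushi
    0, 0, 2, 0),  # monster
  nrow = 6, byrow = TRUE,
  dimnames = list(c("dog", "cat", "mouse", "hamster", "sushi", "monster"),
                  c("D1", "D2", "D3", "D4"))
)

# Singular value decomposition: M = U diag(d) V'
s <- svd(M)

# Hypothetical helper (not part of the lsa package): multiply the first k
# factors back together to obtain a term-by-document matrix again.
rebuild <- function(s, k, dimnames = NULL) {
  Y <- s$u[, 1:k, drop = FALSE] %*%
    diag(s$d[1:k], nrow = k) %*%
    t(s$v[, 1:k, drop = FALSE])
  dimnames(Y) <- dimnames
  Y
}

full    <- rebuild(s, k = length(s$d), dimnames = dimnames(M))  # all factors kept
reduced <- rebuild(s, k = 2, dimnames = dimnames(M))            # only two factors kept

print(round(full, 2))     # reproduces M up to rounding
print(round(reduced, 2))  # a rank-2 approximation of M
```

Keeping every factor returns the input matrix, while keeping fewer factors gives a smoothed approximation in which terms and documents can still be compared row by row or column by column.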
Value
textmatrix
a textmatrix representation of the latent semantic space.
Author(s)
Fridolin Wild f.wild@open.ac.uk
References
Berry, M., Dumais, S., and O'Brien, G. (1995) Using Linear Algebra for Intelligent Information Retrieval. In: SIAM Review, Vol. 37(4), pp. 573–595.
See Also
Examples
# create some files
td = tempfile()
dir.create(td)
write( c("dog", "cat", "mouse"), file=paste(td, "D1", sep="/"))
write( c("hamster", "mouse", "sushi"), file=paste(td, "D2", sep="/"))
write( c("dog", "monster", "monster"), file=paste(td, "D3", sep="/"))
write( c("dog", "mouse", "dog"), file=paste(td, "D4", sep="/"))
# read files into a document-term matrix
myMatrix = textmatrix(td, minWordLength=1)
# create the latent semantic space
myLSAspace = lsa(myMatrix, dims=dimcalc_raw())
# display it as a textmatrix again
round(as.textmatrix(myLSAspace),2) # should give the original
# create a second latent semantic space, this time with reduced dimensionality
myLSAspace = lsa(myMatrix, dims=dimcalc_share())
# display it as a textmatrix again
myNewMatrix = as.textmatrix(myLSAspace)
myNewMatrix # should look different!
# compare two terms with the cosine measure
cosine(myNewMatrix["dog",], myNewMatrix["cat",])
# compare two documents with pearson
cor(myNewMatrix[,1], myNewMatrix[,2], method="pearson")
# clean up
unlink(td, recursive=TRUE)
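
The cosine comparison used in the examples can also be written out in base R. This is a minimal sketch of the standard cosine formula, not the package's own code. The name `cosine_sim` and the two example vectors are hypothetical.

```r
# Hypothetical helper: cosine of the angle between two numeric vectors
cosine_sim <- function(x, y) {
  sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

# Two illustrative term vectors (one value per document)
a <- c(1, 0, 1, 2)
b <- c(1, 0, 0, 0)

print(cosine_sim(a, b))  # similarity between two different vectors
print(cosine_sim(a, a))  # a vector compared with itself
```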