simhash {jiebaR}    R Documentation
Simhash computation
Description
The simhash worker uses the keyword extraction worker to find keywords,
then applies the simhash algorithm to those keywords to compute a simhash
value. dict, hmm, idf and stop_word should be provided when initializing
the jiebaR worker.
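The two steps described above (weighted keywords in, fingerprint out) can be sketched in base R. This is only an illustration of the general simhash scheme from the Charikar reference, not the package's implementation: toy_hash, toy_simhash and hamming are hypothetical helpers, the 16-bit hash is a stand-in chosen for brevity, and the keywords and weights are made up rather than produced by a keyword extraction worker.

```r
# Toy 16-bit string hash (an assumption for illustration only;
# it is NOT the hash function used by jiebaR).
toy_hash <- function(s, bits = 16L) {
  h <- 0
  for (code in utf8ToInt(s)) {
    h <- (h * 31 + code) %% 2^bits
  }
  h
}

# Simhash over weighted keywords: every keyword votes +weight or -weight
# on each bit position, and the sign of the total decides the output bit.
toy_simhash <- function(keywords, weights, bits = 16L) {
  votes <- numeric(bits)
  for (i in seq_along(keywords)) {
    h <- toy_hash(keywords[i], bits)
    hash_bits <- (h %/% 2^(0:(bits - 1))) %% 2  # least significant bit first
    votes <- votes + ifelse(hash_bits == 1, weights[i], -weights[i])
  }
  as.integer(votes > 0)
}

# Hamming distance: the number of bit positions where two fingerprints differ.
hamming <- function(a, b) sum(a != b)

# Made-up keywords and weights standing in for keyword extraction output.
fp1 <- toy_simhash(c("hello", "world", "simhash"), c(1.5, 1.0, 0.8))
fp2 <- toy_simhash(c("hello", "world", "simhash"), c(1.5, 1.0, 0.8))
fp3 <- toy_simhash(c("hello", "there", "simhash"), c(1.5, 1.0, 0.8))

hamming(fp1, fp2)  # identical inputs always give distance 0
hamming(fp1, fp3)  # inputs sharing keywords tend to give a small distance
```

Because each bit is decided by a weighted vote, texts that share their heaviest keywords tend to agree on most bits, which is why fingerprints are usually compared by Hamming distance.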
Usage
simhash(code, jiebar)
vector_simhash(code, jiebar)
Arguments
code | For |
jiebar | jiebaR Worker. |
Details
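The <= operator can be used as a shorthand for this function, with the
worker on the left and the input on the right, as in simhasher <= words.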
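As a hedged sketch of what this implies, the two calls below should be interchangeable. It assumes jiebaR is installed and that <= simply forwards to simhash() with the worker on the left. That equivalence is inferred from the Usage and Examples sections and is not stated explicitly on this page.

```r
# Runs only when jiebaR is available, so the snippet is safe to source anywhere.
if (requireNamespace("jiebaR", quietly = TRUE)) {
  library(jiebaR)
  simhasher <- worker("simhash", topn = 1)

  res_fn <- simhash("hello world", simhasher)  # function form, as in Usage
  res_op <- simhasher <= "hello world"         # operator form, as in Examples

  # Assumed to be TRUE if the operator is a plain shorthand for simhash().
  print(identical(res_fn, res_op))
}
```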
Author(s)
Qin Wenfeng
References
MS Charikar - Similarity Estimation Techniques from Rounding Algorithms
See Also
Examples
## Not run:
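### Simhash
library(jiebaR)
words <- "hello world"
# Simhash worker that keeps only the top keyword
simhasher <- worker("simhash", topn = 1)
# Compute the simhash of `words` with the `<=` shorthand
simhasher <= words
# Compare the simhashes of two strings
distance("hello world", "hello world!", simhasher)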
## End(Not run)
[Package jiebaR version 0.11 Index]