dissimilarity {recommenderlab} | R Documentation |
Dissimilarity and Similarity Calculation Between Rating Data
Description
Calculate dissimilarities/similarities between ratings by users and for items.
Usage
## S4 method for signature 'binaryRatingMatrix'
dissimilarity(x, y = NULL, method = NULL, args = NULL, which = "users")
## S4 method for signature 'realRatingMatrix'
dissimilarity(x, y = NULL, method = NULL, args = NULL, which = "users")
similarity(x, y = NULL, method = NULL, args = NULL, ...)
## S4 method for signature 'ratingMatrix'
similarity(x, y = NULL, method = NULL, args = NULL, which = "users",
min_matching = 0, min_predictive = 0)
Arguments
x |
a ratingMatrix. |
y |
NULL or a second ratingMatrix to calculate cross-(dis)similarities. |
method |
(dis)similarity measure to use. Available measures
are typically "cosine", "pearson", "jaccard", etc. |
args |
a list of additional arguments for the methods. |
which |
a character string indicating if the (dis)similarity should be
calculated between "users" (rows) or "items" (columns). |
min_matching , min_predictive |
Thresholds on the minimum number of ratings used to calculate the similarity and the minimum number of ratings that can be used for prediction. |
... |
further arguments. |
Details
Most dissimilarities and similarities are calculated using the proxy package.
Similarities are typically converted into dissimilarities using
s = 1 / (1 + d) or s = 1 - d
(used for Jaccard, Cosine and Pearson correlation), depending on the measure.
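The two conversions can be illustrated with a short base R sketch, assuming they take the forms s = 1 / (1 + d) and s = 1 - d. The helper names sim_from_unbounded and sim_from_bounded are hypothetical and are not part of recommenderlab or proxy, and the input values are made up for illustration.

```r
# Hypothetical helpers (not part of recommenderlab or proxy).

# Assumed conversion for dissimilarities without an upper bound:
# s = 1 / (1 + d) maps [0, Inf) to (0, 1].
sim_from_unbounded <- function(d) 1 / (1 + d)

# Assumed conversion for dissimilarities bounded in [0, 1]
# (the form used for Jaccard, Cosine and Pearson): s = 1 - d.
sim_from_bounded <- function(d) 1 - d

d <- c(0, 0.25, 1)         # made-up dissimilarity values
sim_from_unbounded(d)      # 1, 0.8, 0.5
sim_from_bounded(d)        # 1, 0.75, 0

# The bounded conversion is its own inverse: d = 1 - s.
all.equal(1 - sim_from_bounded(d), d)
```

Both conversions are monotonically decreasing, so the ordering of neighbors is preserved regardless of which one is applied.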
Similarities are usually defined in the range [0, 1]; however,
Cosine similarity and Pearson correlation are defined in the interval
[-1, 1]. We rescale these
measures with s' = (s + 1) / 2
to the interval [0, 1].
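As a rough illustration of the rescaling, the base R sketch below computes cosine similarity by hand and maps it from [-1, 1] to [0, 1]. It assumes the rescaling is the affine map s' = (s + 1) / 2. The functions cosine_sim and rescale01 are hypothetical helpers written for this example, not functions exported by recommenderlab.

```r
# Hypothetical helpers (not part of recommenderlab).

# Cosine similarity between two numeric vectors, in [-1, 1].
cosine_sim <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# Assumed rescaling: affine map from [-1, 1] to [0, 1].
rescale01 <- function(s) (s + 1) / 2

a <- c(1, -2, 3)
s_same     <- cosine_sim(a,  a)              # same direction:     s =  1
s_opposite <- cosine_sim(a, -a)              # opposite direction: s = -1
s_orth     <- cosine_sim(c(1, 0), c(0, 1))   # orthogonal vectors: s =  0

rescale01(c(s_opposite, s_orth, s_same))     # 0, 0.5, 1
```

Under this mapping, uncorrelated or orthogonal rating vectors receive a rescaled similarity of 0.5 rather than 0.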
Similarities are calculated using only the ratings that are available for both
users/items. This can lead to calculating the measure from a very small number
of ratings (possibly only one). min_matching is the required number of shared
ratings to calculate similarities. To predict ratings, there need to be
additional ratings in argument y. min_predictive is the required number of
additional ratings to calculate similarities. If the min_matching or
min_predictive threshold is not met, then NA is reported instead of the
calculated similarity.
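The effect of the two thresholds can be sketched in base R without calling the package. The helper thresholded_cosine below is hypothetical and only mimics the described behavior; in particular, counting "additional ratings" as items rated by the second user but not by the first is an assumption about the semantics, not a statement about how recommenderlab implements it.

```r
# Toy user-by-item rating matrix; NA marks a missing rating.
r <- rbind(
  u1 = c(5, 3, NA, NA, NA),
  u2 = c(4, NA, 2, 5, NA),
  u3 = c(5, 3, 1, NA, 4)
)

# Hypothetical helper (not part of recommenderlab): cosine similarity on
# co-rated items, returning NA when a threshold is not met.
# Assumption: "additional" ratings are items rated by b but not by a.
thresholded_cosine <- function(a, b, min_matching = 0, min_predictive = 0) {
  shared <- !is.na(a) & !is.na(b)   # items rated by both users
  extra  <- is.na(a) & !is.na(b)    # items only b has rated
  if (sum(shared) < min_matching || sum(extra) < min_predictive) {
    return(NA_real_)
  }
  sum(a[shared] * b[shared]) / sqrt(sum(a[shared]^2) * sum(b[shared]^2))
}

# u1 and u2 share a single item, so the cosine is trivially 1.
thresholded_cosine(r["u1", ], r["u2", ])

# Requiring two shared ratings yields NA for the same pair.
thresholded_cosine(r["u1", ], r["u2", ], min_matching = 2)

# u1 and u3 share two items, and u3 has two further ratings.
thresholded_cosine(r["u1", ], r["u3", ], min_matching = 2, min_predictive = 2)
```

The first call shows why a threshold is useful: a similarity based on one shared rating is always 1 for cosine and carries no information.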
Value
Returns an object of class "dist", "simil" or an appropriate object (e.g.,
a matrix with class "crossdist" or "crosssimil") to represent
a cross-(dis)similarity.
See Also
ratingMatrix
,
dissimilarity
in arules, and
dist
in proxy.
Examples
data(MSWeb)
## between 5 users
dissimilarity(MSWeb[1:5,], method = "jaccard")
similarity(MSWeb[1:5,], method = "jaccard")
## between first 3 items
dissimilarity(MSWeb[,1:3], method = "jaccard", which = "items")
similarity(MSWeb[,1:3], method = "jaccard", which = "items")
## cross-similarity between first 2 users and users 10-20
similarity(MSWeb[1:2,], MSWeb[10:20,], method="jaccard")