R: Finding Neighbor Users And Their Similarity Values

simple_similarity {CFF}

R Documentation

Finding Neighbor Users And Their Similarity Values

Description

The similarity of a user to the active user is calculated in three steps:

1- Calculating the absolute difference between the user's ratings and the active user's ratings on their common items.

2- Calculating the similarity value for each common item.

3- Calculating the mean of these similarity values.

Usage

simple_similarity(ratings, max_score=5, min_score=1, ac)

Arguments

`ratings`	A rating matrix whose rows are items and columns are users.
`max_score`	The maximum possible rating.
`min_score`	The minimum possible rating.
`ac`	The id of the active user, given as an integer column index (`1 \le ac \le number of users`).

Details

The similarity of the active user to each other user is obtained from the following formulas:

dif_{(u_i, j)}=|r_{(u_a, j)}-r_{(u_i, j)}|

sim_{dif_{(u_i, j)}}=\frac{-dif_{(u_i, j)}}{max_score-min_score}+1

sim_{(u_a, u_i)}=\frac{\sum_{j=1}^{N_j}sim_{dif_{(u_i, j)}}}{N_j}

j is the row number for the items and i is the column number for the users in the ratings matrix.

u_i is the user in the ith column and u_a is the active user.

r_{(u_a, j)} is the rating of the active user in the jth row and r_{(u_i, j)} is the rating of the ith user in the jth row.

dif_{(u_i, j)} is the absolute difference between the ratings of the ith user and the active user in the jth row.

sim_{dif_{(u_i, j)}} is the similarity of the ith user to the active user on the item in the jth row.

sim_{(u_a, u_i)} is the overall similarity of the ith user to the active user.

N_j is the number of common items, that is, items rated by both the ith user and the active user.

For example, suppose the active user's ratings are {2, nan, 3, nan, 5} and another user's ratings are {3, 4, nan, nan, 1}. Then, for ratings between 1 and 5:

dif={1, nan, nan, nan, 4} and

sim(dif)={\frac{-1}{5-1}+1, nan, nan, nan, \frac{-4}{5-1}+1}={0.75, nan, nan, nan, 0}

and the mean of sim(dif) over the two common items is sim=0.375.
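The arithmetic above can be checked in a few lines of base R. This is only a sketch of the formulas, not the package's own code; the variable names `active`, `other`, `dif`, `sim_dif` and `sim` are chosen here for illustration.

```r
# Ratings from the worked example; NaN marks an item the user has not rated.
active <- c(2, NaN, 3, NaN, 5)
other  <- c(3, 4, NaN, NaN, 1)
max_score <- 5
min_score <- 1

# Step 1: absolute difference; NaN wherever either rating is missing.
dif <- abs(active - other)

# Step 2: per-item similarity, scaled by the rating range.
sim_dif <- -dif / (max_score - min_score) + 1

# Step 3: mean over the common items only (na.rm = TRUE also drops NaN).
sim <- mean(sim_dif, na.rm = TRUE)
sim  # 0.375
```

Because arithmetic with NaN yields NaN, items missing from either user drop out of the mean without any explicit filtering.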

Value

An object of class "simple_similarity", a list with components:

`call`	The call used.
`sim_x`	Similarity values of the neighboring users, in descending order.
`sim_index`	Column numbers of the neighboring users, in descending order of similarity.

Author(s)

Farimah Houshmand Nanehkaran

Maintainer: Farimah Houshmand Nanehkaran <hoshmandcomputer@gmail.com>

References

Mongia, A., & Majumdar, A. (2019). Matrix completion on multiple graphs: Application in collaborative filtering. Signal Processing, vol. 165, pp. 144-148.

Hong, B., & Yu, M. (2019). A collaborative filtering algorithm based on correlation coefficient. Neural Computing and Applications, vol. 31, no. 12, pp. 8317-8326.

Examples

ratings <- matrix(c(  2,    5,  NaN,  NaN,  NaN,    4,
                    NaN,  NaN,  NaN,    1,  NaN,    5,
                    NaN,    4,    5,  NaN,    4,  NaN,
                      4,  NaN,  NaN,    5,  NaN,  NaN,
                      5,  NaN,    2,  NaN,  NaN,  NaN,
                    NaN,    1,  NaN,    4,    2,  NaN), nrow = 6, byrow = TRUE)  # rows are items, columns are users

sim <- simple_similarity(ratings, max_score=5, min_score=1, ac=1)
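As a cross-check, the formulas from Details can be applied to the same matrix by hand in base R. The helper `sketch_similarity` below is hypothetical and not part of CFF. It assumes that the active user is excluded from its own neighbor list and that users with no common items are dropped; the package may handle these cases differently.

```r
ratings <- matrix(c(  2,    5,  NaN,  NaN,  NaN,    4,
                    NaN,  NaN,  NaN,    1,  NaN,    5,
                    NaN,    4,    5,  NaN,    4,  NaN,
                      4,  NaN,  NaN,    5,  NaN,  NaN,
                      5,  NaN,    2,  NaN,  NaN,  NaN,
                    NaN,    1,  NaN,    4,    2,  NaN), nrow = 6, byrow = TRUE)

# Hypothetical helper (not the CFF implementation): applies the three formulas per column.
sketch_similarity <- function(ratings, max_score = 5, min_score = 1, ac) {
  # The active user's column is recycled down every column of the matrix.
  dif <- abs(ratings - ratings[, ac])
  sim_dif <- -dif / (max_score - min_score) + 1
  # Mean over common items; a user with no common items gets NaN.
  sim <- colMeans(sim_dif, na.rm = TRUE)
  sim[ac] <- NA  # assumption: the active user is not its own neighbor
  # na.last = NA removes NA/NaN entries from the ordering.
  idx <- order(sim, decreasing = TRUE, na.last = NA)
  list(sim_x = sim[idx], sim_index = idx)
}

res <- sketch_similarity(ratings, max_score = 5, min_score = 1, ac = 1)
res$sim_x      # 0.75 0.50 0.25 0.25
res$sim_index  # users 4 and 6 first, then users 2 and 3 (tied at 0.25)
```

User 5 shares no rated item with user 1, so it does not appear in the result of this sketch.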

[Package CFF version 1.0 Index]