Jester5k {recommenderlab}    R Documentation
Jester dataset (5k sample)
Description
The data set contains a sample of 5000 users from the anonymous ratings data of the Jester Online Joke Recommender System, collected between April 1999 and May 2003.
Usage
data(Jester5k)
Format
The format of Jester5k is: Formal class 'realRatingMatrix' [package "recommenderlab"]
The format of JesterJokes is: vector of character strings.
Details
Jester5k contains a 5000 x 100 rating matrix (5000 users and 100 jokes) with ratings between -10.00 and +10.00. All selected users have rated 36 or more jokes.
The data also contains the actual jokes in JesterJokes.
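The properties described above can be checked directly once the data is loaded. The following is a minimal sketch, assuming the recommenderlab package is installed; it only inspects the objects and is not part of the original examples.

```r
library(recommenderlab)
data(Jester5k)

## class of the rating object and type of the joke texts
class(Jester5k)
is.character(JesterJokes)

## dimensions: users (rows) x jokes (columns)
dim(Jester5k)

## smallest and largest rating actually present in the sample
range(getRatings(Jester5k))

## fewest jokes rated by any single user
min(rowCounts(Jester5k))

## number of joke texts, expected to match the number of columns
length(JesterJokes) == ncol(Jester5k)
```

Because most users have not rated every joke, the matrix is sparse; getRatings() returns only the ratings that are present, so the range reflects observed values only.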
References
Ken Goldberg, Theresa Roeder, Dhruv Gupta, and Chris Perkins. "Eigentaste: A Constant Time Collaborative Filtering Algorithm." Information Retrieval, 4(2), 133-151. July 2001.
Examples
data(Jester5k)
Jester5k
## number of ratings
nratings(Jester5k)
## number of ratings per user
summary(rowCounts(Jester5k))
## rating distribution
hist(getRatings(Jester5k), main="Distribution of ratings")
## 'best' joke with highest average rating
best <- which.max(colMeans(Jester5k))
cat(JesterJokes[best])
[Package recommenderlab version 1.0.6 Index]