evaluate {recommenderlab}    R Documentation
Evaluate Recommender Models
Description
Evaluates a single recommender model or a list of recommender models given an evaluation scheme and returns evaluation metrics.
Usage
evaluate(x, method, ...)
## S4 method for signature 'evaluationScheme,character'
evaluate(x, method, type="topNList",
n=1:10, parameter=NULL, progress = TRUE, keepModel=FALSE)
## S4 method for signature 'evaluationScheme,list'
evaluate(x, method, type="topNList",
n=1:10, parameter=NULL, progress = TRUE, keepModel=FALSE)
Arguments
x: an evaluation scheme (class "evaluationScheme").
method: a character string or a list. If a single character string is given, it defines the recommender method used for evaluation. If several recommender methods need to be compared, a (named) list can be given where each element specifies one method by its name and parameters (see the Examples section and the sketch below).
type: evaluate "topNList" or "ratings"?
n: a vector of the different values for N used to generate top-N lists (only if type="topNList").
parameter: a list with parameters for the recommender algorithm (only used when method is a single character string).
progress: logical; report progress?
keepModel: logical; store used recommender models?
...: further arguments.
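A minimal sketch of the two ways to pass algorithm parameters (this assumes the "UBCF" method with an nn parameter is registered for binary data; any registered method works the same way):
## single method: parameters are passed via 'parameter'
data("MSWeb")
es <- evaluationScheme(sample(MSWeb[rowCounts(MSWeb) > 10, ], 100),
  method = "split", train = 0.9, given = 3)
evaluate(es, method = "UBCF", parameter = list(nn = 50), n = 5)
## several methods: parameters are passed via each element's 'param'
evaluate(es, method = list(
  UBCF10 = list(name = "UBCF", param = list(nn = 10)),
  UBCF50 = list(name = "UBCF", param = list(nn = 50))
), n = 5)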
Details
The evaluation uses the specification in the evaluation scheme to train recommender models on training data and then evaluates the models on test data.
The result is a set of accuracy measures averaged over the test users.
See calcPredictionAccuracy
for details on the accuracy measures and the averaging.
Note: The confusion matrix counts are also averaged over users and are therefore not whole numbers.
See vignette("recommenderlab")
for more details on the evaluation process and the metrics used.
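The per-split computation that evaluate() performs can also be reproduced manually. The following sketch shows a single split for rating prediction; it is illustrative only and uses the accessors described elsewhere in the package documentation:
data("Jester5k")
## one train/test split; evaluate() repeats this for every fold/run
e <- evaluationScheme(Jester5k[1:300], method = "split",
  train = 0.9, given = 10, goodRating = 5)
r <- Recommender(getData(e, "train"), "POPULAR")
## predict ratings for the test users from their 'given' known ratings
p <- predict(r, getData(e, "known"), type = "ratings")
## compare the predictions with the withheld ('unknown') ratings
calcPredictionAccuracy(p, getData(e, "unknown"))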
Value
If a single recommender method is specified in method, then an object of class "evaluationResults" is returned. If method is a list of recommender method specifications, then an object of class "evaluationResultList" is returned.
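For example (a small sketch; the class names match those stated above):
data("MSWeb")
es <- evaluationScheme(sample(MSWeb[rowCounts(MSWeb) > 10, ], 100),
  method = "split", train = 0.9, given = 3)
## a single method returns an evaluationResults object
class(evaluate(es, "POPULAR", n = 5, progress = FALSE))
## a list of method specifications returns an evaluationResultList,
## whose elements can be accessed by index as shown in the Examples
class(evaluate(es, list(POP = list(name = "POPULAR", param = NULL)),
  n = 5, progress = FALSE))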
See Also
calcPredictionAccuracy, evaluationScheme, evaluationResults, evaluationResultList.
Examples
### evaluate top-N list recommendations on a 0-1 data set
## Note: we sample only 100 users to make the example run faster
data("MSWeb")
MSWeb10 <- sample(MSWeb[rowCounts(MSWeb) >10,], 100)
## create an evaluation scheme (10-fold cross validation, given-3 scheme)
es <- evaluationScheme(MSWeb10, method="cross-validation",
k=10, given=3)
## run evaluation
ev <- evaluate(es, "POPULAR", n=c(1,3,5,10))
ev
## look at the results (the length of the topNList is shown as column n)
getResults(ev)
## get the confusion matrices averaged over the 10 folds
avg(ev)
plot(ev, annotate = TRUE)
## evaluate several algorithms (including a hybrid recommender) with a list
algorithms <- list(
  RANDOM  = list(name = "RANDOM", param = NULL),
  POPULAR = list(name = "POPULAR", param = NULL),
  HYBRID  = list(name = "HYBRID", param = list(
    recommenders = list(
      RANDOM  = list(name = "RANDOM", param = NULL),
      POPULAR = list(name = "POPULAR", param = NULL)
    )
  ))
)
evlist <- evaluate(es, algorithms, n=c(1,3,5,10))
evlist
names(evlist)
## select the first results by index
evlist[[1]]
avg(evlist[[1]])
plot(evlist, legend="topright")
### Evaluate using a data set with real-valued ratings
## Note: we sample only 100 users to make the example run faster
data("Jester5k")
es <- evaluationScheme(Jester5k[1:100], method="split",
train=.9, given=10, goodRating=5)
## Note: goodRating is used to determine positive ratings
## predict top-N recommendation lists
## (results in TPR/FPR and precision/recall)
ev <- evaluate(es, "RANDOM", type="topNList", n=10)
getResults(ev)
## predict missing ratings
## (results in RMSE, MSE and MAE)
ev <- evaluate(es, "RANDOM", type="ratings")
getResults(ev)