gssvd {gsrs}	R Documentation

Train the group-specific model and test model performance

Description

The gssvd() function uses a ratings dataset to train a group-specific recommender system, tests its performance, and outputs the key matrices for prediction. To run the training process in parallel, the doParallel package is recommended. For details on how the simulated dataset was created, please refer to http://dx.doi.org/10.1080/01621459.2016.1219261.

Usage

gssvd(
  train,
  test,
  B = 10,
  C = 10,
  K,
  tol_1 = 0.001,
  tol_2 = 1e-05,
  lambda = 2,
  max_iter = 100,
  verbose = 0,
  user_group = NULL,
  item_group = NULL
)

Arguments

train

Train set, a matrix with three columns (userID, movieID, ratings)

test

Test set, a matrix with three columns (userID, movieID, ratings)

B

Number of user groups, 10 by default; does not need to be specified if the user_group parameter is not NULL

C

Number of item groups, 10 by default; does not need to be specified if the item_group parameter is not NULL

K

Number of latent factors

tol_1

The stopping criterion for the outer loop of the proposed algorithm, 1e-3 by default

tol_2

The stopping criterion for sub-loops, 1e-5 by default

lambda

Value of the penalty term in the ridge regression used for ALS, 2 by default

max_iter

Maximum number of iterations in the training process, 100 by default

verbose

Boolean, whether to print detailed intermediate computations during the training process, 0 by default

user_group

Optional parameter; should be an n-dimensional vector, where n is the total number of users and each element is the group ID of the corresponding user (if not specified, the missing pattern is used to form the groups)

item_group

Optional parameter; should be an m-dimensional vector, where m is the total number of items and each element is the group ID of the corresponding item (if not specified, the missing pattern is used to form the groups)
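
To illustrate what a user_group vector might look like, here is a minimal sketch in base R. It is only an assumption-laden example, not the package's internal grouping rule: the toy data, the names toy_train, n_rated and n_groups, the assumption that user IDs run from 1 to n, and the choice to bin users by how many ratings they have are all made up for illustration.

```r
# Hypothetical toy training data: columns are (userID, movieID, ratings).
toy_train <- data.frame(
  userID  = c(1, 1, 1, 2, 2, 3, 4, 4, 4, 4),
  movieID = c(1, 2, 3, 1, 3, 2, 1, 2, 3, 4),
  ratings = c(5, 3, 4, 2, 1, 4, 5, 5, 3, 2)
)

n_users  <- max(toy_train$userID)  # assumption: user IDs run from 1 to n
n_groups <- 2                      # illustrative number of user groups

# Count the observed ratings of each user.
n_rated <- tabulate(toy_train$userID, nbins = n_users)

# Assign group IDs 1..n_groups by rank of the rating count.
# Assumption: grouping by number of ratings; the package may group differently.
user_group <- ceiling(rank(n_rated, ties.method = "first") * n_groups / n_users)
user_group
```

The result has one entry per user, so it has the shape described for user_group; an item_group vector could be built the same way from the movieID column.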

Value

Returns a list of results, including the matrices P, Q, S, and T and the RMSE on the test set (RMSE_Test)
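
As a reminder of what an RMSE measures, the following sketch computes the root mean squared error between observed and predicted ratings. It uses the standard definition; the helper rmse() and the two vectors are hypothetical and are not taken from the package.

```r
# Root mean squared error between observed and predicted ratings
# (standard definition; rmse() is a hypothetical helper, not a gsrs function).
rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

# Made-up ratings for illustration only.
observed  <- c(4, 3, 5, 2)
predicted <- c(3.5, 3, 4, 3)
rmse(observed, predicted)
```

A smaller value means the predicted ratings are closer to the held-out ratings on average.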

Author(s)

Yifei Zhang, Xuan Bi

References

Xuan Bi, Annie Qu, Junhui Wang & Xiaotong Shen. A Group-Specific Recommender System. Journal of the American Statistical Association, 112:519, 1344-1353. DOI: 10.1080/01621459.2016.1219261.

Please contact the author should you encounter any problems. A fast version written in Matlab is available at https://sites.google.com/site/xuanbigts/software.

Examples

## Training model on the simulated data file
library(doParallel)
registerDoParallel(cores=2)
# CRAN limits the number of cores available to packages to 2;
# in a real work setting you can use cores = detectCores() - 1.
getDoParWorkers()
example_data_path = system.file("extdata", "sim_data.txt", package="gsrs")
ratings = read.table(example_data_path, sep =":", header = FALSE)[1:100,]
# Initialization Parameters
K=3
B=10
C=10
lambda = 2
max_iter = 1 # usually more than 10;
tol_1=1e-1
tol_2=1e-1
# Train Test Split
N=dim(ratings)[1]
test_rate = 0.3
train.row=which(rank(ratings[, 1]) <= floor((1 - test_rate) * N))
test.row=which(rank(ratings[, 1]) > floor((1 - test_rate) * N))
train.data=ratings[train.row,1:3]
test.data=ratings[test.row,1:3]
# Call gssvd function
a = gssvd(train=train.data, test=test.data, B=B, C=C, K=K,
          tol_1=tol_1, tol_2=tol_2, lambda=lambda,
          max_iter=max_iter, verbose=1)
stopImplicitCluster()
# Output the result
a$RMSE_Test
head(a$P)
head(a$Q)
head(a$S)
head(a$T)

[Package gsrs version 0.1.1 Index]