impute_recosystem {autostats} | R Documentation
impute_recosystem
Description
Imputes missing values of a numeric matrix using stochastic gradient descent, via the recosystem package.
Usage
impute_recosystem(
  .data,
  lrate = c(0.05, 0.1),
  costp_l1 = c(0, 0.05),
  costq_l1 = c(0, 0.05),
  costp_l2 = c(0, 0.05),
  costq_l2 = c(0, 0.05),
  nthread = 8,
  loss = "l2",
  niter = 15,
  verbose = FALSE,
  nfold = 4,
  seed = 1
)
Arguments
.data | long format data frame
lrate | learning rate
costp_l1 | L1 regularization cost for the user factors (p)
costq_l1 | L1 regularization cost for the item factors (q)
costp_l2 | L2 regularization cost for the user factors (p)
costq_l2 | L2 regularization cost for the item factors (q)
nthread | number of threads
loss | loss function; "l1" can also be used
niter | number of training iterations used when tuning
verbose | show training loss?
nfold | number of folds for tuning validation
seed | seed for randomness
Details
The input is a long data frame with three columns: an ID column, an item column (the column names from pivoting longer), and a rating column (the values from pivoting longer).
Pre-processing generally requires pivoting a wide user x item matrix to long format. Missing values from the matrix must be retained as NA in the rating column; the algorithm predicts and fills in these values. The output is a long data frame with the same number of rows as the input, but with no missing values.
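As a minimal sketch of that pre-processing step, the snippet below reshapes a toy wide matrix into the three-column long layout while keeping the NA cells. The matrix, the dimension names, and the column names `user`, `item`, and `rating` are all made up for illustration. Base R is used here to stay dependency-free; `tidyr::pivot_longer()` would be the more usual route.

```r
# Toy wide user x item matrix; NA marks the ratings to be imputed.
wide <- matrix(
  c( 5,  3, NA,
     4, NA,  2,
    NA,  1,  4,
     2,  5, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(user = paste0("u", 1:4), item = paste0("i", 1:3))
)

# as.table() + as.data.frame() gives one row per (user, item) cell
# and keeps the NA cells in the value column.
long <- as.data.frame(as.table(wide), responseName = "rating",
                      stringsAsFactors = FALSE)

nrow(long)               # one row per cell of the wide matrix
sum(is.na(long$rating))  # NA cells are retained, not dropped
```

A data frame shaped like `long` matches the layout described for `.data`; whether the function expects particular column names or types is not stated here, so treat those as assumptions.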
This function automatically tunes the recosystem learner before applying it. Candidate parameter values can be supplied for tuning. To skip tuning, supply a single value for each parameter.
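To illustrate the difference between tuning and not tuning, the sketch below counts candidate parameter combinations using the defaults from the Usage section. It assumes tuning considers every combination of the supplied values, which is how recosystem's tuner is understood to work; `candidates`, `grid`, and `fixed` are hypothetical names, and the function's internal grid may include parameters not shown here.

```r
# Default candidate values, copied from the Usage section.
candidates <- list(
  lrate    = c(0.05, 0.1),
  costp_l1 = c(0, 0.05),
  costq_l1 = c(0, 0.05),
  costp_l2 = c(0, 0.05),
  costq_l2 = c(0, 0.05)
)

# Assumption: the tuner searches the full Cartesian product.
grid <- expand.grid(candidates)
nrow(grid)  # number of combinations to evaluate

# Keeping a single value per parameter collapses the grid to one row,
# so there is nothing left to choose between.
fixed <- lapply(candidates, function(v) v[1])
nrow(expand.grid(fixed))
```

Passing each element of `fixed` as the corresponding argument is one way to supply single values; which values are sensible depends on the data.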
Value
A long format data frame with the same number of rows as the input and no missing values.