CMF {CMF}

R Documentation

Collective Matrix Factorization

Description

Learns the collective matrix factorization (CMF) model for a given collection of M matrices. The function fits the parameters of a variational approximation for CMF and, if `test` is given, also computes predictions for the indices it specifies.

Usage

CMF(X, inds, K, likelihood, D, test = NULL, opts = NULL)

Arguments

`X`	List of input matrices.
`inds`	A `length(X)` by 2 matrix that links the dimensions of the matrices in `X` to object sets. `inds[m, 1]` gives the object set that corresponds to the rows of matrix `X[[m]]`, and `inds[m, 2]` gives the object set that corresponds to its columns.
`K`	The number of factors.
`likelihood`	A list of likelihood choices, one for each matrix in `X`. Each entry must be one of the strings "gaussian", "bernoulli" or "poisson".
`D`	A vector containing sizes of each object set.
`test`	A list of test matrices. If not `NULL`, predictions are computed for the elements listed in these matrices. This duplicates the functionality of `predictCMF()`.
`opts`	A list of options as given by `getCMFopts()`. If set to `NULL`, the default values will be used.
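
As an illustration of how `X`, `inds` and `D` fit together, the following base R sketch sets up three object sets linked by three matrices and checks that the dimensions agree. The set sizes, the random data and the name `dims_ok` are assumptions made for this example only, and the dense matrices would still need to be converted to the package's sparse format before being passed to `CMF()`.

```r
set.seed(1)

# Sizes of three hypothetical object sets
D <- c(10, 20, 30)

# Row m of inds gives the (row set, column set) of matrix m
inds <- matrix(c(1, 2,
                 1, 3,
                 2, 3), ncol = 2, byrow = TRUE)

# One dense matrix per row of inds, with sizes taken from D
X <- lapply(seq_len(nrow(inds)), function(m) {
  matrix(rnorm(D[inds[m, 1]] * D[inds[m, 2]]), nrow = D[inds[m, 1]])
})

# One likelihood per matrix
likelihood <- c("gaussian", "gaussian", "gaussian")

# Check that every matrix matches the sizes of the object sets it links
dims_ok <- vapply(seq_along(X), function(m) {
  all(dim(X[[m]]) == D[inds[m, ]])
}, logical(1))
stopifnot(all(dims_ok), length(likelihood) == length(X))
```

A check of this kind is cheap and catches the most common setup mistake, namely a row of `inds` that points to an object set of the wrong size.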

Details

The variational approximation is fully factorized over all of the model parameters, including the individual elements of the projection matrices. The parameters of the projection matrices are updated jointly using the Newton-Raphson method, whereas the remaining parameters use closed-form updates.

Note that the input data must be given in a specific sparse format. See `matrix_to_triplets()` for details.
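
The idea behind a triplet (coordinate) representation can be sketched in base R as below. This is only an illustration, under the assumption that each observed entry is stored as one (row, column, value) row and that missing entries are simply left out. The helper `to_triplets_sketch()` is hypothetical and not part of the package; use `matrix_to_triplets()` to prepare real input.

```r
# Hypothetical helper: dense matrix -> one (row, column, value) row per observed entry
to_triplets_sketch <- function(mat) {
  observed <- which(!is.na(mat), arr.ind = TRUE)  # positions of non-missing entries
  cbind(row = observed[, "row"],
        col = observed[, "col"],
        value = mat[observed])
}

# A 3 by 2 matrix with three observed entries and three missing ones
mat <- matrix(c(1.5, NA,
                NA,  2.0,
                0.5, NA), nrow = 3, byrow = TRUE)

triplets <- to_triplets_sketch(mat)
print(triplets)
```

Because missing entries correspond to absent rows, holding out data for testing amounts to taking a subset of the rows.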

The behavior of the algorithm can be modified via the `opts` parameter. See `getCMFopts()` for details. Of particular interest are the elements `useBias` and `method`.

For a full description of the output parameters, see the publication listed under References. The notation in the code roughly follows the notation used in the paper.

Value

A list with the following elements:

`U`	A list of the mean parameters for the rank-K projection matrices, one for each object set.
`covU`	A list of the variance parameters for the rank-K projection matrices, one for each object set.
`tau`	A vector of the precision parameter means.
`alpha`	A vector of the ARD parameter means.
`cost`	A vector of variational lower bound values.
`inds`	The input parameter `inds` stored for further use.
`errors`	A vector containing root-mean-square errors for each iteration, computed over the elements indicated by the `test` parameter.
`bias`	A list (of lists) storing the parameters of the row and column bias terms.
`D`	The sizes of the object sets as given in the parameters.
`K`	The number of components as given in the parameters.
`Uall`	The matrices in `U` joined into a single `sum(D)` by `K` matrix, for easier plotting of the results.
`items`	A list giving the running number of each item across all object sets. These numbers correspond to rows of the `Uall` matrix. Each element of the list is a vector holding the numbers for one object set.
`out`	If test matrices were provided, the reconstructed data sets. Otherwise `NULL`.
`M`	The number of input matrices.
`likelihood`	The likelihoods of the matrices.
`opts`	The options used for running the code.
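
To make the relationship between `U`, `Uall` and `items` concrete, the sketch below rebuilds the latter two from a mock `U` in base R. The random matrices stand in for a fitted model, the names `Uall_sketch` and `items_sketch` are hypothetical, and the stacking order (object sets in order, joined with `rbind`) is an assumption inferred from the descriptions above rather than taken from the package source.

```r
set.seed(1)
D <- c(3, 4, 2)   # hypothetical object-set sizes
K <- 2            # number of factors

# Mock mean parameters: one D[s] by K matrix per object set
U <- lapply(D, function(d) matrix(rnorm(d * K), nrow = d, ncol = K))

# Stack the per-set matrices into one sum(D) by K matrix
Uall_sketch <- do.call(rbind, U)

# Running row numbers of each object set within the stacked matrix
offsets <- c(0, cumsum(D)[-length(D)])
items_sketch <- lapply(seq_along(D), function(s) offsets[s] + seq_len(D[s]))

# The rows selected by items_sketch[[s]] recover U[[s]]
stopifnot(all(dim(Uall_sketch) == c(sum(D), K)),
          identical(Uall_sketch[items_sketch[[2]], ], U[[2]]))
```

With a real model, the same indexing would let a plot of `Uall` be colored or split by object set.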

Author(s)

Arto Klami and Lauri Väre

References

Arto Klami, Guillaume Bouchard, and Abhishek Tripathi. Group-sparse embeddings in collective matrix factorization. arXiv:1312.5921, 2014.

Examples

# See CMF-package for an example.
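
For orientation, a shortened end-to-end sketch of the workflow described on this page might look as follows. It is not the package's own example: the data are random (so no good predictive accuracy should be expected), the split into training and test rows is arbitrary, and the option name `iter.max` is assumed to be one of the entries returned by `getCMFopts()`. The model is only fitted if the CMF package is installed.

```r
set.seed(1)

# Three object sets linked by three matrices in a circular setup
D <- c(10, 20, 30)
inds <- matrix(c(1, 2,
                 1, 3,
                 2, 3), ncol = 2, byrow = TRUE)
n_cells <- D[inds[, 1]] * D[inds[, 2]]

# Random continuous, binary and count data (no low-rank structure)
X <- list(
  matrix(rnorm(n_cells[1]), nrow = D[1]),
  matrix(rbinom(n_cells[2], 1, 0.5), nrow = D[1]),
  matrix(rpois(n_cells[3], 3), nrow = D[2])
)
likelihood <- c("gaussian", "bernoulli", "poisson")

if (requireNamespace("CMF", quietly = TRUE)) {
  # Convert to the sparse format and hold out half of each matrix
  triplets <- lapply(X, CMF::matrix_to_triplets)
  train <- list()
  test <- list()
  for (m in seq_along(triplets)) {
    keep <- sample(nrow(triplets[[m]]), size = nrow(triplets[[m]]) %/% 2)
    train[[m]] <- triplets[[m]][keep, ]
    test[[m]] <- triplets[[m]][-keep, ]
  }

  opts <- CMF::getCMFopts()
  opts$iter.max <- 100   # assumed option name; fewer iterations for a quick run

  model <- CMF::CMF(train, inds, K = 4, likelihood, D, test = test, opts = opts)

  print(dim(model$Uall))        # documented above as sum(D) by K
  print(head(model$out[[1]]))   # reconstructed held-out entries of matrix 1
}
```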

[Package CMF version 1.0.3 Index]