tensorBF {tensorBF} | R Documentation |
Bayesian Factorization of a Tensor
Description
tensorBF
implements the Bayesian factorization of a tensor.
Usage
tensorBF(Y, method = "CP", K = NULL, opts = NULL,
fiberCentering = NULL, slabScaling = NULL, noiseProp = c(0.5, 0.5))
Arguments
Y |
is a three-mode tensor to be factorized. |
method |
the factorization method. Currently only "CP" (default) is supported. |
K |
The number of components (i.e. latent variables or factors). Recommended to be set somewhat higher than the expected number of components, so that the method can determine the model complexity by pruning excess components (default: 20% of the sum of the lower two dimensions). High values result in high CPU time. NOTE: Adjust parameter noiseProp if sufficiently large values of K do not lead to a model with pruned components. |
opts |
List of model options; see function |
fiberCentering |
the mode for which fibers are to be centered at zero (default = NULL). Fiber is analogous to a vector in a particular mode. Fiber centering and Slab scaling are the recommended normalizations for a tensor. For details see the provided normalization functions and the references therein. |
slabScaling |
the mode for which slabs are to be scaled to unit variance (default = NULL). Slab is analogous to a matrix in a particular mode. Alternatively, you can preprocess the data using the provided normalization functions. |
noiseProp |
c(prop,conf); sets an informative noise prior for tensorBF.
The model sets the noise prior such that the expected proportion of
variance explained by noise is defined by this parameter. It is recommended when
the standard prior from
- prop defines the proportion of total variance to be explained by noise (between 0.1 and 0.9),
- conf defines the confidence in the prior (between 0.1 and 10).
We suggest a default value of c(0.5,0.5) for real data sets. |
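The fiber centering and slab scaling described under fiberCentering and slabScaling can be illustrated with base R alone. The following is a minimal sketch under one plausible reading of those definitions (centering each mode-1 fiber Y[, j, k] at zero, then scaling each mode-2 slab Y[, j, ] to unit variance); it is not the package's own normalization code, and the variable names are hypothetical.

```r
set.seed(1)
# Toy three-mode tensor with a non-zero mean and non-unit variance
Y <- array(rnorm(4 * 5 * 3, mean = 2, sd = 3), dim = c(4, 5, 3))

# Fiber centering in mode 1 (assumed reading): subtract the mean of each
# fiber Y[, j, k], so every such fiber has zero mean
fiber_means <- apply(Y, c(2, 3), mean, na.rm = TRUE)
Y_centered <- sweep(Y, c(2, 3), fiber_means, "-")

# Slab scaling in mode 2 (assumed reading): divide each slab Y[, j, ]
# by its standard deviation, so every such slab has unit variance
slab_sds <- apply(Y_centered, 2, function(slab) sd(as.vector(slab), na.rm = TRUE))
Y_scaled <- sweep(Y_centered, 2, slab_sds, "/")
```

The na.rm = TRUE arguments let the same sketch run on tensors that contain NA entries.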
Details
Bayesian Tensor Factorization performs tri-linear (CP) factorization of a tensor.
The method automatically identifies the number of components,
provided K is initialized to a large enough value; see Arguments.
Missing values are supported and should be set to NA in the data.
They will not affect the model parameters, and can be predicted
with function predictTensorBF, based on the observed values.
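To make the tri-linear (CP) structure concrete, the sketch below rebuilds a tensor from three factor matrices as a sum of K rank-one terms and then marks some entries as NA, the way missing values are expected to be encoded. It uses base R only and does not call the package; cp_reconstruct is a hypothetical helper introduced for illustration.

```r
# Hypothetical helper: sum of K rank-one terms X[, k] o W[, k] o U[, k]
cp_reconstruct <- function(X, W, U) {
  Y_hat <- array(0, dim = c(nrow(X), nrow(W), nrow(U)))
  for (k in seq_len(ncol(X))) {
    Y_hat <- Y_hat + outer(outer(X[, k], W[, k]), U[, k])
  }
  Y_hat
}

set.seed(42)
X <- matrix(rnorm(6 * 2), 6, 2)
W <- matrix(rnorm(5 * 2), 5, 2)
U <- matrix(rnorm(3 * 2), 3, 2)
Y <- cp_reconstruct(X, W, U)

# Encode 10 randomly chosen entries as missing
Y_obs <- Y
missing_idx <- sample(length(Y), 10)
Y_obs[missing_idx] <- NA

# Only the observed entries carry information about the factors
observed <- !is.na(Y_obs)
```

Each entry satisfies Y[i, j, l] = sum(X[i, ] * W[j, ] * U[l, ]), which is the quantity a fitted model would use to fill in the NA positions.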
Value
A list containing model parameters. For key parameters, the final posterior sample ordered w.r.t. component variance is provided to aid in initial checks; all the posterior samples should be used for model analysis. The list elements are:
K |
The number of learned components. If this value is not less than the input argument K, the model should be rerun with a larger K, or the noiseProp parameter should be used. |
X |
a matrix of |
W |
a matrix of |
U |
a matrix of |
tau |
The last sample of noise precision. |
and the following elements:
posterior |
the posterior samples of model parameters (X,U,W,Z,tau). |
cost |
The likelihood of all the posterior samples. |
opts |
The options used to run the model. |
conv |
An estimate of the convergence of the model, based on reconstruction
of data using the Geweke diagnostic. Values significantly above 0.05 occur when
the model has not converged; in that case it should be rerun with a higher value of iter.burnin in |
pre |
A list of centering and scaling values used to transform the data, if any. Else an empty list. |
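The ordering of components "w.r.t. component variance" can be pictured with a small base R sketch. It assumes that a component's contribution is measured by the sum of squares of its rank-one term, which factorizes into the product of the squared column norms of X, W and U; the package may define component variance differently, and all names below are hypothetical.

```r
set.seed(7)
K <- 3
X <- matrix(rnorm(20 * K), 20, K)
W <- matrix(rnorm(25 * K), 25, K)
U <- matrix(rnorm(3 * K), 3, K)

# Sum of squares of each rank-one term (assumed proxy for component variance)
comp_energy <- colSums(X^2) * colSums(W^2) * colSums(U^2)

# Reorder the columns of all three factor matrices consistently,
# largest contribution first
ord <- order(comp_energy, decreasing = TRUE)
X_ord <- X[, ord, drop = FALSE]
W_ord <- W[, ord, drop = FALSE]
U_ord <- U[, ord, drop = FALSE]
```

Because the same permutation is applied to every factor matrix, the reconstructed tensor is unchanged; only the component labels move.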
Examples
#Data generation
K <- 2
X <- matrix(rnorm(20*K),20,K)
W <- matrix(rnorm(25*K),25,K)
U <- matrix(rnorm(3*K),3,K)
Y <- 0
for(k in 1:K) Y <- Y + outer(outer(X[,k],W[,k]),U[,k])
Y <- Y + array(rnorm(20*25*3,0,0.25),dim=c(20,25,3))
#Run the method with default options
## Not run: res2 <- tensorBF(Y=Y)
#Run the method with K=3 and 1000 burn-in iterations (iter.burnin)
## Not run: opts <- getDefaultOpts(); opts$iter.burnin = 1000
## Not run: res1 <- tensorBF(Y=Y,K=3,opts=opts)
#Vary the user defined expected proportion of noise variance
#explained. c(0.2,1) represents 0.2 as the noise proportion
#and confidence of 1
## Not run: res3 <- tensorBF(Y=Y,noiseProp=c(0.2,1))