PTAk {PTAk} | R Documentation |
Principal Tensor Analysis on k modes
Description
Performs a truncated SVD-kmodes analysis with or without specific metrics, penalised or not.
Usage
PTAk(X,nbPT=2,nbPT2=1,minpct=0.1,
smoothing=FALSE,
smoo=list(NA),
verbose=getOption("verbose"),file=NULL,
modesnam=NULL,addedcomment="", ...)
Arguments
X |
a tensor (as an array) of order k, if non-identity metrics are
used |
nbPT |
integer vector of length (k-2) specifying the maximum number of Principal Tensors requested for the (3,...,k-1, k) modes levels (see details); if it is not a vector, every level gets the same given nbPT value |
nbPT2 |
if 0 no 2-modes solutions will be computed, 1 =all, >1 otherwise |
minpct |
number between 0 and 100 controlling the computation of further solutions at this level and below |
smoothing |
|
smoo |
see |
verbose |
control printing |
file |
output printed at the prompt if |
modesnam |
character vector of the names of the modes, if |
addedcomment |
character string printed if |
... |
any other arguments passed to other functions |
Details
According to the decomposition described in Leibovici (1993) and
Leibovici and Sabatier (1998) the function gives a generalisation of
the SVD (2 modes) to k modes. The algorithm is recursive,
calling APSOLUk which calls PTAk for (k-1).
nbPT, nbPT2 and minpct control the number of Principal Tensors
desired. For example nbPT=c(2,4,3) means a tensor of order 5 is
analysed: the maximum number of 5-modes PT is set to 3; for each of
them one sets a maximum of 4 associated 4-modes PT (for each of the
five components); for each of the latter a maximum of 2 associated
3-modes PT is asked (for each of the four components). Then nbPT2
completes the scheme for the 2-modes solutions, associated or not.
Overall, minpct controls whether the algorithm carries on at any level
and lower, i.e. it stops if 100(vs^2/ssx)<minpct (where vs is the
singular value, and ssx is the total sum of squares of the tensor X or
of the "metric transformed" X).
Putting a 0 at a given level in nbPT automatically puts 0 in nbPT at
lower levels. Putting high values in nbPT leaves minpct as the only
control, which helps to reach the full decomposition. All these
controls allow the full decomposition to be truncated in a
level-controlled fashion. Notice the full decomposition always
contains any possible choice of truncation, i.e. the solutions are not
dependent on the truncation scheme (Generalised Eckart-Young Theorem).
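In the 2-modes case, the stopping rule and the truncation property described above can be checked with base R alone. The sketch below is only an illustration on an ordinary matrix using svd(); it does not call PTAk, and the names M, pct and keep are made up for the example.

```r
# Illustration only: the 2-modes analogue of the minpct rule, using base R svd().
set.seed(1)
M <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4)

full <- svd(M)                  # full decomposition
ssx  <- sum(M^2)                # total sum of squares of M
pct  <- 100 * full$d^2 / ssx    # percentage carried by each singular value

minpct <- 5                     # arbitrary threshold chosen for this sketch
keep <- pct >= minpct           # solutions for which 100*(vs^2/ssx) >= minpct

# The percentages of a full decomposition add up to 100.
stopifnot(isTRUE(all.equal(sum(pct), 100)))

# Truncation does not change the retained solution: asking for a single pair
# of singular vectors gives the same leading rank-one term as the full
# decomposition (compared as a product, so a joint sign flip does not matter).
trunc1 <- svd(M, nu = 1, nv = 1)
rank1_full  <- full$d[1] * full$u[, 1] %o% full$v[, 1]
rank1_trunc <- trunc1$d[1] * trunc1$u[, 1] %o% trunc1$v[, 1]
stopifnot(isTRUE(all.equal(rank1_full, rank1_trunc)))
```

Here which(keep) lists the solutions that would survive the threshold; for k modes the same percentage is computed from the singular value of each Principal Tensor.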
Recent work by Tamara G. Kolda (2003) showed on an example that orthogonal rank
decompositions are not necessarily nested. This makes PTA-kmodes a model with
nested decompositions that does not give the exact orthogonal rank.
PTA-kmodes therefore looks for the best approximation by orthogonal tensors in a nested approximation process.
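As a rough picture of what a single k-modes solution looks like (one singular value with one component per mode), the following base R sketch searches for a rank-one approximation of a 3-way array by alternating contractions. It is an assumption-laden illustration, not the code used by APSOLUk or PTAk; the function rank_one_3way and its arguments are hypothetical names introduced here.

```r
# Illustration only: alternating search for a rank-one approximation of a
# 3-way array. Not the PTAk implementation; all names are hypothetical.
rank_one_3way <- function(X, iters = 100, tol = 1e-10) {
  dims <- dim(X)
  unit <- function(a) a / sqrt(sum(a^2))
  # random unit-norm starting vectors, one per mode
  u <- unit(rnorm(dims[1]))
  v <- unit(rnorm(dims[2]))
  w <- unit(rnorm(dims[3]))
  sigma <- 0
  for (i in seq_len(iters)) {
    # contract X with two of the vectors, then renormalise the third
    u <- unit(apply(X, 1, function(s) sum(s * (v %o% w))))
    v <- unit(apply(X, 2, function(s) sum(s * (u %o% w))))
    w <- unit(apply(X, 3, function(s) sum(s * (u %o% v))))
    new_sigma <- sum(X * (u %o% v %o% w))   # current "singular value"
    if (abs(new_sigma - sigma) < tol) break
    sigma <- new_sigma
  }
  list(d = new_sigma, u = u, v = v, w = w, iter = i)
}

set.seed(42)
X <- c(1, 2, 2) %o% c(3, 4) %o% c(1, 2, 2, 4)   # an exactly rank-one array
sol <- rank_one_3way(X)
100 * sol$d^2 / sum(X^2)   # share of the total sum of squares captured
```

The last line computes the same kind of percentage that minpct is compared against; on noisy data the result depends on the starting vectors, which is one reason this should be read as a sketch only.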
Value
a PTAk object, which consists of a list of lists. Each mode has a list containing:
$v |
matrix of components for the given mode |
$iter |
vector of iteration numbers at which the maximum was reached |
$test |
vector of test values at maximum |
$modesnam |
name of the mode |
The last mode list has also some additional information on the analysis done:
$d |
vector of singular values |
$pct |
percentage of sum of squares for each squared singular value |
$ssX |
vector of local sums of squares, i.e. of the current tensor within the recursive algorithm |
$vsnam |
vector of names given to the singular values according to a recursive, data-dependent scheme |
$datanam |
data reference |
$method |
call applied: could be PTAk or CANDPARA or PCAn or even SVDgen, with parameters choices |
$addedcomment |
the addedcomment (repeated) given in the call |
Methods other than PTAk may not have all of these list elements, only the essential ones such as $v, $d and $ssX, and may also have additional ones like $coremat for PCAn (the core array).
Note
The use of metrics (diagonal or not) allows flexibility of analysis as in the 2-modes case, e.g. correspondence analysis, discriminant analysis, robust analysis. The smoothing option, which extends the analysis towards functional data analysis, is theoretically valid for Principal Tensors belonging to a tensor product of separable Hilbert spaces (e.g. Sobolev spaces); see Leibovici and El Maache (1997).
Author(s)
Didier G. Leibovici
References
Leibovici D (1993) Facteurs à Mesures Répétées et Analyses Factorielles : applications à un suivi épidémiologique. Université de Montpellier II. PhD Thesis in Mathématiques et Applications (Biostatistiques).
Leibovici D and El Maache H (1997) Une décomposition en Valeurs Singulières d'un élément d'un produit Tensoriel de k espaces de Hilbert Séparables. Compte Rendus de l'Académie des Sciences tome 325, série I, Statistiques (Statistics) & Probabilités (Probability Theory): 779-782.
Leibovici D and Sabatier R (1998) A Singular Value Decomposition of a k-ways array for a Principal Component Analysis of multi-way data, the PTA-k. Linear Algebra and its Applications, 269:307-329.
Kolda TG (2003) A Counterexample to the Possibility of an Extension of the Eckart-Young Low-Rank Approximation Theorem for the Orthogonal Rank Tensor Decomposition. SIAM J. Matrix Analysis, 24(2):763-767.
Leibovici D (2008) A Simple Penalised algorithm for SVD and Multiway functional methods. (to be submitted in the future)
Leibovici DG (2010) Spatio-temporal Multiway Decomposition using Principal Tensor Analysis on k-modes: the R package PTAk. Journal of Statistical Software, 34(10), 1-34. doi:10.18637/jss.v034.i10
See Also
REBUILD, FCAk, PTA3, summary.PTAk
Examples
# don <- array((1:3)%x%rnorm(6*4)%x%(1:10),c(10,4,6,3))
don <- array(1:360,c(5,4,6,3))
don <- don + rnorm(360,1,2)
dimnames(don) <- list(paste("s",1:5,sep=""),paste("T",1:4,sep=""),
paste("t",1:6,sep=""),c("young","normal","old"))
# hypothetical data on learning curves at different ages and periods of the year
ones <-list(list(v=rep(1,5)),list(v=rep(1,4)),list(v=rep(1,6)),list(v=rep(1,3)))
don <- PROJOT(don,ones)
don.sol <- PTAk(don,nbPT=1,nbPT2=2,minpct=0.01,
verbose=TRUE,
modesnam=c("Subjects","Trimester","Time","Age"),
addedcomment="centered on each mode")
don.sol[[1]] # mode Subjects results and components
don.sol[[2]] # mode Trimester results and components
don.sol[[3]] # mode Time results and components
don.sol[[4]] # mode Age results and components with additional information on the call
summary(don.sol,testvar=2)
plot(don.sol,mod=c(1,2,3,4),nb1=1,nb2=NULL,
xlab="Subjects/Trimester/Time/Age",main="Best rank-one approx" )
plot(don.sol,mod=c(1,2,3,4),nb1=4,nb2=NULL,
xlab="Subjects/Trimester/Time/Age",main="Associated to Subject vs1111")
# demo function
# demo.PTAk()