CSanalysis,matrix,matrix,CSsmfa-method {CSFA}    R Documentation



Performs an interactive connectivity score (CS) analysis with sMFA (Sparse Multiple Factor Analysis). Multiple queries should be used for this analysis. Depending on lambda, either the spca or the arrayspc algorithm is used.


## S4 method for signature 'matrix,matrix,CSsmfa'
CSanalysis(querMat, refMat, type = "Csmfa",
  K = 15, para, lambda = 1e-06, sparse.dim = 2, sparse = "penalty",
  max.iter = 200, eps.conv = 0.001, which = c(2, 3, 4, 5),
  component.plot = NULL, CSrank.queryplot = FALSE, column.interest = NULL,
  row.interest = NULL, profile.type = "gene", color.columns = NULL,
  gene.highlight = NULL, gene.thresP = 1, gene.thresN = -1,
  thresP.col = "blue", thresN.col = "red", grouploadings.labels = NULL,
  grouploadings.cutoff = NULL, legend.names = NULL, legend.cols = NULL,
  legend.pos = "topright", labels = TRUE, result.available = NULL,
  result.available.update = FALSE, plot.type = "device",
  basefilename = NULL)
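
A call might look like the following minimal sketch. The simulated matrices, `K = 3`, and the penalty values in `para` are illustrative assumptions, not recommendations, and passing the type as the character string `"CSsmfa"` is inferred from the method signature. The analysis itself only runs in an interactive session with the CSFA package installed.

```r
set.seed(1)

# Simulated expression data (assumption: 100 genes, 4 query and 20 reference compounds)
n_genes <- 100
querMat <- matrix(rnorm(n_genes * 4), nrow = n_genes,
                  dimnames = list(paste0("gene", 1:n_genes), paste0("query", 1:4)))
refMat <- matrix(rnorm(n_genes * 20), nrow = n_genes,
                 dimnames = list(paste0("gene", 1:n_genes), paste0("ref", 1:20)))

# Argument names come from the usage above; the values are illustrative only
args <- list(querMat = querMat, refMat = refMat, type = "CSsmfa",
             K = 3, para = rep(0.1, 3), sparse = "penalty",
             which = c(2, 3), component.plot = 1)

# Only attempt the analysis interactively and when CSFA is available
if (interactive() && requireNamespace("CSFA", quietly = TRUE)) {
  res <- do.call(CSFA::CSanalysis, args)  # a CSresult object
}
```

Setting `component.plot` explicitly avoids the interactive component selection that is triggered when it is left at NULL.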



querMat: Query matrix (rows = genes, columns = compounds).


refMat: Reference matrix.




K: sMFA parameter. Number of components.


para: sMFA parameter. A vector of length K. All elements should be positive. If sparse="varnum", the elements should be integers.


lambda: sMFA parameter. Quadratic penalty parameter. Default value is 1e-6. If the dimension targeted for sparseness is larger than the other dimension (p > n), it is advised to set lambda to Inf, which uses the arrayspc algorithm optimized for this case. In the other case (p < n), a zero or positive lambda is sufficient and the normal spca algorithm is used.
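
The rule of thumb for lambda can be sketched in base R as follows. `choose_lambda` is a hypothetical helper, not part of CSFA, and it assumes that p and n are read from the dimensions of the data matrix being decomposed; how CSFA measures them internally is not stated on this page.

```r
# Hypothetical helper (not part of CSFA): pick lambda from the data dimensions.
# sparse.dim follows the convention of this page: 1 = rows, 2 = columns.
choose_lambda <- function(dataMat, sparse.dim = 2, default = 1e-06) {
  p <- dim(dataMat)[sparse.dim]       # size of the dimension made sparse
  n <- dim(dataMat)[3 - sparse.dim]   # size of the other dimension
  if (p > n) Inf else default
}

wide <- matrix(0, nrow = 10, ncol = 50)   # 10 genes, 50 compounds
tall <- matrix(0, nrow = 50, ncol = 10)   # 50 genes, 10 compounds

choose_lambda(wide, sparse.dim = 2)  # Inf   (50 > 10, arrayspc case)
choose_lambda(tall, sparse.dim = 2)  # 1e-06 (10 < 50, spca case)
```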


sparse.dim: sMFA parameter. Which dimension should be sparse? 1: rows, 2: columns (default). Note: for connectivity scores it is advised to apply sparsity on the compounds/columns.


sparse: sMFA parameter (lambda < Inf only). If sparse="penalty", para is a vector of 1-norm penalty parameters. If sparse="varnum", para defines the number of sparse loadings to be obtained.
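
The two ways of specifying para can be illustrated with a small base R check. `check_para` is a hypothetical helper, not part of CSFA, and the values are made up; it only encodes the rules stated on this page (length K, positive, whole numbers for "varnum").

```r
# Hypothetical helper (not part of CSFA): check para against the rules on this page.
check_para <- function(para, K, sparse = c("penalty", "varnum")) {
  sparse <- match.arg(sparse)
  stopifnot(is.numeric(para), length(para) == K, all(para > 0))
  if (sparse == "varnum") {
    # varnum: each element is a count of sparse loadings, so it must be whole
    stopifnot(all(para == round(para)))
  }
  invisible(TRUE)
}

K <- 3
check_para(rep(0.1, K), K, sparse = "penalty")   # one 1-norm penalty per component
check_para(c(10, 5, 5), K, sparse = "varnum")    # one loading count per component
```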


max.iter: sMFA parameter. Maximum number of iterations.


eps.conv: sMFA parameter. Convergence criterion.


which: Choose one or more plots to draw:

  1. Information Content for Bicluster (Only available for "CSfabia")

  2. Loadings for query compounds

  3. Loadings for Component (Factor/Bicluster) component.plot

  4. Gene Scores for Component (Factor/Bicluster) component.plot

  5. Connectivity Ranking Scores for Component component.plot

  6. Component component.plot vs. Other Component: Loadings & Genes

  7. Profile plot (see profile.type)

  8. Group Loadings Plots for all components (see grouploadings.labels).


component.plot: Which components (Factor/Bicluster) should be investigated? Can be a vector of multiple components (e.g. c(1,3,5)). If NULL, you can choose components of interest interactively from the query loadings plot.


CSrank.queryplot: Logical value deciding whether the CS Rank Scores (which=5) should also be plotted per query (instead of only the weighted mean).


column.interest: Numeric vector of indices of reference columns which should be in the profiles plots (which=7). If NULL, you can interactively select compounds on the Compound Loadings plot (which=3).


row.interest: Numeric vector of gene indices to be plotted in the gene profiles plot (which=7, profile.type="gene"). If NULL, you can interactively select them in the gene scores plot (which=4).


profile.type: Type of which=7 plot:

  • "gene": Gene profiles plot of the genes selected in row.interest, with the query compounds and those selected in column.interest ordered first on the x-axis. The other compounds are ordered by decreasing CScore.

  • "cmpd": Compound profiles plot of the query and selected compounds (column.interest), with only those genes on the x-axis that pass the thresholds (gene.thresP, gene.thresN).


color.columns: Vector of colors for the query and reference columns (compounds). If NULL, blue will be used for query and black for reference. Use this option to highlight query columns and reference columns of interest.


gene.highlight: Single numeric vector or list of at most 5 numeric vectors. This highlights genes of interest in the gene scores plot (which=4) in up to 5 different colors (e.g. you can use this to highlight genes you know to be differentially expressed).


gene.thresP: Threshold for genes with a high score (which=4).


gene.thresN: Threshold for genes with a low score (which=4).


thresP.col: Color of genes above gene.thresP.


thresN.col: Color of genes below gene.thresN.
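
How the two thresholds and their colors interact can be sketched in base R. The gene scores below are made up, the strict comparisons are an assumption about how "above" and "below" are applied, and "black" for the remaining genes is also an assumption.

```r
# Illustrative gene scores for one component (made-up values)
gene_scores <- c(geneA = 2.3, geneB = -0.4, geneC = -1.7, geneD = 0.9, geneE = 1.2)

gene.thresP <- 1
gene.thresN <- -1
thresP.col <- "blue"
thresN.col <- "red"

# Colour each gene by the threshold it passes; "black" for the rest is an assumption
gene_cols <- ifelse(gene_scores > gene.thresP, thresP.col,
                    ifelse(gene_scores < gene.thresN, thresN.col, "black"))

high_genes <- names(gene_scores)[gene_scores > gene.thresP]  # geneA, geneE
low_genes  <- names(gene_scores)[gene_scores < gene.thresN]  # geneC
```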


grouploadings.labels: This parameter is used for the Group Loadings Plots (which=8). In general this plot will contain the loadings of all factors, grouped and colored by the labels given in this parameter. Two types of plot can be created:

  1. If grouploadings.labels!=NULL:
    Provide a vector for all samples (query + ref) containing the labels on which the plot will be based.

  2. If grouploadings.labels=NULL:
    If no labels are provided when choosing which=8, automatic labels ("Top Samples of Component 1, 2....") will be created. These labels are given to the top grouploadings.cutoff number of samples based on the absolute values of the loadings.

Plot which=8 can be used to check 2 different situations. The first plot checks whether your provided labels coincide with the structures discovered in the analysis. The second plot aims to find new interesting structures (of samples) which appear strongly in one or multiple components. A subsequent step could be to take some strong samples/compounds of these components and use them as a new query set in a new CS analysis to check its validity or to find newly connected compounds.

Please note that even when grouploadings.labels!=NULL, the labels based on the absolute loadings of all the factors (the top grouploadings.cutoff) will always be generated and saved in samplefactorlabels in the extra slot of the CSresult object. This can later be used with the CSlabelscompare function to compare them with your true labels.


grouploadings.cutoff: Parameter used in plot which=8. An integer giving the number of top samples that receive an automatic label. See grouploadings.labels=NULL for more information. If this parameter is not provided, it will be automatically set to 10% of the total number of loadings.
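
The automatic labelling can be sketched in base R as follows. This is not the CSFA implementation: the loadings are simulated, the rounding of the 10% default is an assumption, and "number of loadings" is taken here to mean the number of samples per component.

```r
set.seed(1)

# Made-up loadings: 30 samples (query + reference) on 2 components
loadings <- matrix(rnorm(60), nrow = 30,
                   dimnames = list(paste0("cmpd", 1:30), c("Comp1", "Comp2")))

# Default cutoff: 10% of the number of loadings (rounding is an assumption)
grouploadings.cutoff <- round(0.10 * nrow(loadings))   # 3

# For each component, take the samples with the largest absolute loadings
top_samples <- lapply(colnames(loadings), function(comp) {
  ord <- order(abs(loadings[, comp]), decreasing = TRUE)
  rownames(loadings)[ord[seq_len(grouploadings.cutoff)]]
})
names(top_samples) <- paste("Top Samples of Component", seq_along(top_samples))
```

A sample can appear in the top list of more than one component, which matches the idea of structures that appear strongly in multiple components.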


legend.names: Option to draw a legend of, for example, colored columns in the Compound Loadings plot (which=3). If NULL, only "References" will be in the legend.


legend.cols: Colors to be used in legends. If NULL, only blue for "Queries" is used.


legend.pos: Position of the legend in all requested plots. Can be "topright", "topleft", "bottomleft", "bottomright", "bottom", "top", "left", "right" or "center".


labels: Boolean value (default=TRUE) to use row and/or column text labels in the score plots (which=c(3,4,5,6)).


result.available: You can supply an object previously returned by CSanalysis in order to only draw graphs, without recomputing the scores.
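
Reusing an earlier result might look like the sketch below. `redraw_plots` is a hypothetical wrapper, not part of CSFA, and it assumes `res` is a CSresult returned by an earlier CSanalysis call on the same matrices. Whether further arguments such as K or para must be repeated is not stated here, so they can be passed through `...` if needed.

```r
# Sketch only: `res` is assumed to be a CSresult from an earlier CSanalysis call
# on the same querMat and refMat.
redraw_plots <- function(querMat, refMat, res, ...) {
  CSFA::CSanalysis(querMat, refMat, "CSsmfa",
                   which = c(3, 4),          # compound loadings and gene scores
                   component.plot = 1,       # component to inspect
                   result.available = res,   # reuse the stored result
                   ...)
}
```

With CSFA installed, `redraw_plots(querMat, refMat, res)` would then draw the requested plots for component 1.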


result.available.update: Logical value. If TRUE, the CS and GS will be overwritten depending on the new component.plot choice. This would also delete the p-values if permutation.object was available.


plot.type: How should the plots be output? "pdf" to save them in pdf files, "device" to draw them in a graphics device (default), "sweave" to use them in a Sweave or knitr file.


basefilename: Directory including filename of the graphs if saved in pdf files.


Returns an object of the S4 class CSresult (see CSresult-class).

[Package CSFA version 1.2.0 Index]