ppa {isa2} | R Documentation |
The Ping-Pong Algorithm
Description
Run the PPA with the default parameters
Usage
## S4 method for signature 'list'
ppa(data, ...)
Arguments
data |
The input, a list of two numeric matrices, with the same
number of columns. They may contain NA and/or NaN values; see the details below. |
... |
Additional arguments, see details below. |
Details
Please read the isa2-package manual page for an introduction to ISA and PPA.
This function can be called as
ppa(data, thr.row1 = seq(1, 3, by = 0.5), thr.row2 = seq(1, 3, by = 0.5), thr.col = seq(1, 3, by = 0.5), no.seeds = 100, direction = "updown")
where the arguments are:
- data
The input, a list of two numeric matrices with the same number of columns. They may contain NA and/or NaN values, but then the algorithm might get slower, as R matrix multiplication is sometimes slower for these matrices, depending on your platform.
- thr.row1
Numeric scalar or vector giving the threshold parameter for the rows of the first matrix. Higher values indicate a more stringent threshold, and the resulting co-modules will contain fewer rows of the first matrix on average. The threshold is measured by the number of standard deviations from the mean, over the values of the first row vector. If it is a vector, then it must contain an entry for each seed.
- thr.row2
Numeric scalar or vector, the threshold parameter(s) for the rows of the second matrix. See thr.row1 for details.
- thr.col
Numeric scalar or vector giving the threshold parameter for the columns of both matrices. The analogue of thr.row1.
- no.seeds
Integer scalar, the number of random seeds to use.
- direction
Character vector of length four, one for each matrix multiplication performed during a PPA iteration. It specifies whether we are interested in rows/columns that are higher (‘up’) than average, lower than average (‘down’), or both (‘updown’). The first and the second entries both correspond to the common column dimension of the two matrices, so they should be equal; otherwise a warning is given.
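The threshold and direction arguments described above can be illustrated with a small base R sketch. It is not the isa2 implementation: the helper `select_by_threshold` and the toy score vector are hypothetical, and the sketch only assumes what the text states, namely that a threshold counts standard deviations from the mean and that the direction selects the high side, the low side, or both.

```r
## Hypothetical helper, not part of isa2: flag the entries of a score
## vector that lie more than `thr` standard deviations from the mean.
select_by_threshold <- function(x, thr,
                                direction = c("updown", "up", "down")) {
  direction <- match.arg(direction)
  z <- (x - mean(x)) / sd(x)   # distance from the mean, in SD units
  switch(direction,
         up     = z >  thr,    # higher than average only
         down   = z < -thr,    # lower than average only
         updown = abs(z) > thr) # both tails
}

## Toy scores with one strongly high and one strongly low entry
scores <- c(0.1, -0.2, 0.0, 0.3, -0.1, 5.0, -4.0, 0.2, 0.0, -0.3)

sel_mild   <- select_by_threshold(scores, thr = 1, direction = "updown")
sel_strict <- select_by_threshold(scores, thr = 2, direction = "updown")
sel_up     <- select_by_threshold(scores, thr = 1, direction = "up")

which(sel_mild)    # entries kept with a mild threshold
which(sel_strict)  # a stricter threshold keeps fewer entries
which(sel_up)      # only the high tail is kept
```

With a higher threshold fewer entries pass, which mirrors the statement that more stringent thresholds give smaller co-modules on average.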
The ppa function provides an easy interface to the PPA. It
runs all steps of a typical PPA workflow, with their default
parameters.
This involves:
- Normalizing the input matrices by calling ppa.normalize.
- Generating random input seeds via generate.seeds.
- Running the PPA with all combinations of the given row1, row2 and column thresholds (by default 1, 1.5, 2, 2.5, 3), by calling ppa.iterate.
- Merging similar co-modules, separately for each threshold combination, by calling ppa.unique.
- Filtering the co-modules, separately for each threshold combination, by calling ppa.filter.robust.
- Putting all co-modules from the runs with different thresholds into a single object.
- Merging similar co-modules again, but now across all threshold combinations. If two co-modules are similar, then the larger one, the one with milder thresholds, is kept.
Please see the manual pages of these functions for the details.
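To get a feeling for how many PPA runs the default thresholds imply, the threshold combinations can be enumerated in base R. This is only a sketch of the combinatorics, assuming "all combinations" means the full Cartesian product of the three threshold vectors; the variable name `thr.grid` is made up here, and the code does not call any isa2 function.

```r
## Default threshold vectors, as given in the call signature above
thr.row1 <- seq(1, 3, by = 0.5)
thr.row2 <- seq(1, 3, by = 0.5)
thr.col  <- seq(1, 3, by = 0.5)

## One row per (thr.row1, thr.row2, thr.col) combination
thr.grid <- expand.grid(thr.row1 = thr.row1,
                        thr.row2 = thr.row2,
                        thr.col  = thr.col)

nrow(thr.grid)   # number of threshold combinations
head(thr.grid)   # first few combinations
```

With five values per threshold this yields 5 x 5 x 5 = 125 combinations, so passing shorter threshold vectors is a simple way to reduce the running time.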
Value
A named list is returned with the following elements:
rows1 |
The first components of the co-modules, corresponding to the rows of the first input matrix. Every column corresponds to a co-module. If an element (the score of the row) is non-zero, then that row is included in the co-module, otherwise it is not. Scores are between -1 and 1. If two scores have the same non-zero sign, then the corresponding first matrix rows are correlated. If they have opposite signs, then they are anti-correlated. If an input seed did not converge within the allowed number of
iterations, then that column of |
rows2 |
This is the same as |
columns |
The same as |
seeddata |
A data frame containing information about the co-modules. There is one row for each co-module. The data frame has the following columns:
|
rundata |
A named list with information about the PPA run. It has the following entries:
|
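The score matrices in the returned list can be read column by column. The following sketch uses a small hand-made matrix as a stand-in for the rows1 element (the numbers are invented, not PPA output) and shows how the non-zero rule translates into member indices and co-module sizes.

```r
## Toy stand-in for a 'rows1' score matrix: 5 rows, 2 co-modules.
## The values are made up for illustration only.
rows1 <- cbind(c(0.8, 0.0, 1.0, 0.0, -0.5),
               c(0.0, 0.6, 0.0, 0.0,  1.0))

## Rows with a non-zero score belong to the co-module
members <- lapply(seq_len(ncol(rows1)),
                  function(j) which(rows1[, j] != 0))

## Number of rows in each co-module
sizes <- colSums(rows1 != 0)

## Signs of the member scores in the first co-module: equal signs
## indicate correlated rows, opposite signs anti-correlated rows
signs1 <- sign(rows1[members[[1]], 1])

members
sizes
signs1
```

The same pattern applies to the rows2 and columns elements, since they are described as having the same layout.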
Author(s)
Gabor Csardi Gabor.Csardi@unil.ch
References
Kutalik Z, Bergmann S, Beckmann J: A modular approach for integrative analysis of large-scale gene-expression and drug-response data. Nat Biotechnol 2008 May; 26(5): 531-9.
See Also
isa2-package for a short introduction to the ISA and the Ping-Pong algorithms. See the functions mentioned above if you want to change the default PPA parameters.
Examples
## We do not run this, it takes relatively long
## Not run:
data <- ppa.in.silico(noise=0.1)
ppa.result <- ppa(data[1:2], direction="up")
## Find the best bicluster for each block in the input
## (based on the rows of the first input matrix)
best <- apply(cor(ppa.result$rows1, data[[3]]), 2, which.max)
## Check correlation
sapply(seq_along(best),
       function(x) cor(ppa.result$rows1[,best[x]], data[[3]][,x]))
## The same for the rows of the second matrix
sapply(seq_along(best),
       function(x) cor(ppa.result$rows2[,best[x]], data[[4]][,x]))
## The same for the columns
sapply(seq_along(best),
       function(x) cor(ppa.result$columns[,best[x]], data[[5]][,x]))
## Plot the data and the modules found
if (interactive()) {
  layout(rbind(1:2,c(3,6),c(4,7), c(5,8)))
  image(data[[1]], main="In-silico data, first matrix")
  image(data[[2]], main="In-silico data, second matrix")
  sapply(best[1:3], function(b) image(outer(ppa.result$rows1[,b],
                                            ppa.result$columns[,b]),
                                      main=paste("Module", b)))
  sapply(best[1:3], function(b) image(outer(ppa.result$rows2[,b],
                                            ppa.result$columns[,b]),
                                      main=paste("Module", b)))
}
## End(Not run)