fichiers {SARP.compo} | R Documentation |
Create and read a file of p-values for all pairwise tests of all possible ratios of a compositional vector
Description
These functions perform hypothesis tests on all possible pairwise ratios or differences of a set of variables in a given data frame, and either store the results in a file or read them back from it.
Usage
creer.Fp( d, nom.fichier,
noms, f.p = student.fpc,
log = FALSE, en.log = !log,
nom.var = 'R',
noms.colonnes = c( "Cmp.1", "Cmp.2", "p" ),
add.col = "delta",
sep = ";", dec = ".", row.names = FALSE, col.names = TRUE,
... )
grf.Fp( nom.fichier, col.noms = c( 1, 2 ), p = 0.05, col.p = 'p',
reference = NULL, groupes = NULL,
sep = ";", dec = ".", header = TRUE,
... )
Arguments
d |
The data frame that contains the compositional variables. Other
objects will be coerced as data frames using
|
nom.fichier |
A length-one character vector giving the name of the file |
noms |
A character vector containing the column names of the compositional variables to be used for ratio computations. Names absent from the data frame will be ignored with a warning. Optionally, an integer vector containing the column numbers can be given instead; these are converted to column names before further processing. |
f.p |
An R function that will perform the hypothesis test on a single
ratio (or log ratio, depending on This function should return a numeric vector, of which the first one
will typically be the p-value from the test — see
Such functions are provided for several common situations, see links at the end of this manual page. |
log |
If If |
en.log |
If |
nom.var |
A length-one character vector giving the name of the
variable containing a single ratio (or log-ratio). No sanity check
is performed on it: if you experience strange behaviour, check you
gave a valid column name, for instance using
|
noms.colonnes |
A length-three character vector giving the names
of, respectively, the two columns of the data frame that will contain
the components identifiers and of the column that will contain the
p-value from the test (the first value returned by |
add.col |
A character vector giving the names of additional
columns of the data.frame, used for storing additional return values
of |
sep , dec , row.names , col.names , header |
Options for controlling
the file format, used by |
col.noms |
A length-two vector giving the two columns that contain the two components of the ratio. Can be given either as column number or column name. |
col.p |
A length-one vector giving the column that contains the p-value of the ratio. Can be given either as column number or column name. |
p |
The p-value cut-off to be used when creating the
graph, see |
reference |
A character vector giving the names of nodes that should be displayed with a different color in the created graph. These names should match component names present in the file. A typical use is for reference genes in qRT-PCR experiments. By default, all nodes are displayed in palegreen; reference nodes, if any, are displayed in orange. |
groupes |
A list of character vectors giving sets of logically related nodes, defining groups of nodes that will share a common color. Currently unimplemented. |
... |
additional arguments to |
Details
These functions are essentially the same as the functions that create data frames (creer.DFp) and use data frames to create a graph (grf.DFp), except that they work on text files. This makes it possible to handle compositional data with thousands of components, such as RNA-Seq or microarray data.
If the results are viewed as a matrix, computations are done row by row and the file is updated after each row. Only the upper-triangular part, without the diagonal, is stored in the file.
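The row-by-row layout described above can be sketched in base R. This is only a minimal illustration, not the code of creer.Fp: the toy data frame, the helper name ecrire.paires, and the use of t.test on log-ratios are all assumptions made for the example.

```r
# Toy data: three hypothetical components and a two-level grouping factor
set.seed(1)
d <- data.frame(A = rlnorm(10), B = rlnorm(10), C = rlnorm(10),
                Groupe = gl(2, 5))

# Hypothetical helper: one file line per pair (i, j) with i < j,
# appended to the file one matrix row at a time
ecrire.paires <- function(d, noms, groupe, fichier, sep = ";") {
  entete <- TRUE
  for (i in seq_len(length(noms) - 1)) {
    # Upper-triangular part of matrix row i: columns i + 1 to the last one
    j <- (i + 1):length(noms)
    p <- vapply(noms[j], function(n) {
      # Log-ratio of the two components, split by group (assumes 2 groups)
      parts <- split(log(d[[noms[i]]] / d[[n]]), d[[groupe]])
      t.test(parts[[1]], parts[[2]])$p.value
    }, numeric(1), USE.NAMES = FALSE)
    ligne <- data.frame(Cmp.1 = noms[i], Cmp.2 = noms[j], p = p)
    # Write the header only once, then append the following rows
    write.table(ligne, fichier, sep = sep, dec = ".", quote = FALSE,
                row.names = FALSE, col.names = entete, append = !entete)
    entete <- FALSE
  }
  invisible(NULL)
}

fichier <- tempfile(fileext = ".csv")
ecrire.paires(d, c("A", "B", "C"), "Groupe", fichier)
res <- read.table(fichier, header = TRUE, sep = ";", dec = ".",
                  stringsAsFactors = FALSE)
```

With three components, the file holds three lines, one for each of the pairs A/B, A/C and B/C. Appending after each row means that only one row of results is held in memory at any time.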
The function that creates the graph from a file is not very efficient and can take a long time for huge matrices. A better alternative is to first filter the file with shell tools such as gawk or perl, or with dedicated C software, and then load the resulting file as a data.frame before converting it into a graph. Note that this approach may lose some isolated nodes.
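The caveat about isolated nodes can be illustrated with a small base R sketch. The file content is invented for the example, and the rule used to keep rows (p above the cut-off) is an assumption about how edges are defined; adapt it to the rule actually applied when building the graph.

```r
# Invented p-value file, in the format written with sep = ";" and dec = "."
fichier <- tempfile(fileext = ".csv")
writeLines(c("Cmp.1;Cmp.2;p",
             "A;B;0.80",
             "A;C;0.01",
             "B;C;0.02"), fichier)

seuil <- 0.05
DFp <- read.table(fichier, header = TRUE, sep = ";", dec = ".",
                  stringsAsFactors = FALSE)

# Assumed rule: keep only the pairs whose p-value exceeds the cut-off
DFp.filtre <- DFp[DFp$p > seuil, ]

# Components named in the full file and in the filtered one
noeuds <- union(DFp$Cmp.1, DFp$Cmp.2)
noeuds.filtre <- union(DFp.filtre$Cmp.1, DFp.filtre$Cmp.2)

# Components that no longer appear in any row after filtering
perdus <- setdiff(noeuds, noeuds.filtre)
```

In this toy file, component C no longer appears in any row after filtering, so a graph built from DFp.filtre alone would not contain it. Keeping the full list of component names (noeuds here) makes it possible to add such nodes back as isolated nodes.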
Value
creer.Fp does not return anything.
grf.Fp returns the resulting graph.
Note
Creating a file and working from it is quite slow, so for compositional data with only a few components, consider using creer.DFp, which creates the data.frame directly in memory, and grf.DFp, which creates the graph from a data.frame.
Author(s)
Emmanuel Curis (emmanuel.curis@parisdescartes.fr)
See Also
Predefined f.p functions: anva1.fpc for one-way analysis of variance; kw.fpc for the non-parametric equivalent (Kruskal-Wallis test).
For directly creating and manipulating matrices, see creer.Mp and grf.Mp.
Examples
# Load the pottery data set
data( poteries )
# Create the file name in R's temporary directory
nom.fichier <- paste0( tempdir(), "/fichier_test.csv" )
nom.fichier
# Compute one-way ANOVA p-values for all ratios in this data set
# and store them in a text file
creer.Fp( poteries, nom.fichier,
c( 'Al', 'Na', 'Fe', 'Ca', 'Mg' ),
f.p = anva1.fpc, v.X = 'Site',
add.col = c( 'mu0', 'd.C', 'd.CoA', 'd.IT', 'd.L' ) )
# Make a graph from it and plot it
plot( grf.Fp( nom.fichier ) )
# The file is a simple text-file that can be read as a data.frame
DFp <- read.table( nom.fichier, header = TRUE, sep = ";", dec = "." )
DFp