pfcoMod {fcros}

R Documentation

Searching for differentially expressed genes or detecting recurrent copy number aberration probes using an approach based on the Perron-Frobenius theorem

Description

Implementation of a method based on fold change ranks and the Perron-Frobenius theorem to search for differentially expressed genes or to detect recurrent chromosomal copy number aberration probes. This function should be used with a matrix of fold changes or ratios from a biological dataset (microarray, RNA-seq, ...). The function pfcoMod() is an extension of the function pfco() to datasets that do not contain replicate samples or that contain a single biological condition. Statistics are associated with genes/probes to characterize their change levels.

Usage

pfcoMod(fcMat, samp, log2.opt = 0, trim.opt = 0.25)

Arguments

`fcMat`	A matrix containing fold changes or ratios from a biological dataset, to be processed to search for differentially expressed genes or to detect recurrent copy number aberration regions. The rownames of fcMat are used for the output idnames.
`samp`	A vector of sample label names that should appear among the column names of the matrix fcMat.
`log2.opt`	A scalar equal to 0 or 1. The value 0 (default) means that the values in the matrix "fcMat" are expressed on a log2 scale: `log2.opt` = 0
`trim.opt`	A scalar between 0 and 0.5. The value 0.25 (default) means that the lowest 25% and the highest 25% of the rank values for each gene are not used for computing the statistic "ri", i.e. the rank values within the interquartile range are averaged: `trim.opt` = 0.25
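
To illustrate what `trim.opt` controls, the following minimal sketch computes a trimmed average of rank values with base R. It is not the package's implementation: the toy matrix, the per-column ranking and the scaling of ranks by the number of rows are assumptions made for illustration only.

```r
set.seed(1)

# Hypothetical toy matrix: 6 genes x 8 fold-change columns (log2 scale)
fcMat <- matrix(rnorm(48), nrow = 6,
                dimnames = list(paste0("g", 1:6), paste0("Col", 1:8)))
trim.opt <- 0.25

# Rank the genes within each column and scale the ranks to (0, 1]
# (assumed normalization, chosen for this sketch)
rk <- apply(fcMat, 2, rank) / nrow(fcMat)

# For each gene, drop the lowest and highest 25% of its rank values
# and average the remaining ones
ri <- apply(rk, 1, mean, trim = trim.opt)
print(ri)
```

With 8 columns and `trim.opt` = 0.25, the two smallest and the two largest rank values of each gene are discarded and the middle four are averaged.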

Details

The label names given in the parameter "samp" should match label names of columns in the data matrix "fcMat". It is not necessary to use all of the label names appearing in the columns of the matrix.
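
The matching rule above can be checked before calling the function. The sketch below uses base R only; the toy matrix and the column labels are hypothetical, and the check is a suggestion rather than something pfcoMod() is known to perform itself.

```r
# Hypothetical toy matrix with four labelled columns
fcMat <- matrix(1:12, nrow = 3,
                dimnames = list(c("g1", "g2", "g3"),
                                c("Col1", "Col2", "Col3", "Col4")))

# Only a subset of the column labels is used
samp <- c("Col1", "Col3")

# Stop early if any requested label is not a column name
unknown <- setdiff(samp, colnames(fcMat))
if (length(unknown) > 0) {
    stop("labels not found in the matrix: ", paste(unknown, collapse = ", "))
}

# Columns that would take part in the analysis
sub <- fcMat[, samp, drop = FALSE]
print(sub)
```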

Value

This function returns a data frame containing the following 8 components:

`idnames`	A vector containing the list of IDs or symbols associated with genes
`ri`	The average of ordered rank values associated with genes in the dataset. These values are rank statistics leading to the f-values and the p-values.
`FC2`	The robust fold changes for genes in matrix "fcMat". These fold changes are calculated as a trimmed mean of the values in "fcMat". Non-log-scale values are used in this calculation.
`f.value`	The f-values are probabilities associated with genes, computed using the "mean" and the "standard deviation" ("sd") of the values in "ri". The "mean" and "sd" are used as the parameters of a normal distribution.
`p.value`	The p-values associated with genes. The p-values are obtained from the fold change ranks using a one sample t-test.
`comp`	Singular values.
`comp.w`	Weights of the singular values.
`comp.wcum`	Cumulative sum of the weights of the singular values.
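
As a rough illustration of how some of these components relate to each other, the sketch below derives f-values from a vector of rank statistics and cumulative weights from a vector of singular values. All input values are hypothetical. Reading the f-values as the normal cumulative distribution function evaluated at "ri", and the weights as singular values divided by their sum, are assumptions based on the descriptions above, not a copy of the package code.

```r
# Hypothetical rank statistics for five genes
ri <- c(g1 = 0.12, g2 = 0.45, g3 = 0.50, g4 = 0.58, g5 = 0.91)

# f-values: normal CDF at ri, with mean and sd estimated from ri itself
# (assumed interpretation of the f.value description)
f.value <- pnorm(ri, mean = mean(ri), sd = sd(ri))

# Hypothetical singular values in decreasing order
comp <- c(4.0, 2.5, 1.0, 0.5)

# Assumed definition: each weight is the share of the total
comp.w <- comp / sum(comp)
comp.wcum <- cumsum(comp.w)

print(round(f.value, 3))
print(comp.wcum)
```

Under this reading, genes with f-values close to 0 have consistently low ranks and genes with f-values close to 1 have consistently high ranks, which is how the Examples section separates down- and up-regulated genes.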

Author(s)

Doulaye Dembele doulaye@igbmc.fr

References

Dembele D, Analysis of high-throughput biological data using their rank values, Stat Methods Med Res, accepted for publication, 2018

Examples

   data(fdata);
   rownames(fdata) <- fdata[,1];

   cont <- c("cont01", "cont07", "cont03", "cont04", "cont08");
   test <- c("test01", "test02", "test08", "test09", "test05");
   log2.opt <- 0;
   trim.opt <- 0.25;

   # perform pfcoMod()
   fc <- fcrosFCmat(fdata, cont, test, log2.opt, trim.opt);
   m <- ncol(fc$fcMat)
   samp <- paste("Col",as.character(1:m), sep = "");
   fc.val <- data.frame(fc$fcMat)
   colnames(fc.val) <- samp
   rownames(fc.val) <- fdata[,1]

   af <- pfcoMod(fc.val, samp, log2.opt, trim.opt);

   # now select top 20 down and/or up regulated genes
   top20 <- fcrosTopN(af, 20);
   alpha1 <- top20$alpha[1];
   alpha2 <- top20$alpha[2];
   f.value <- af$f.value;

   # row indices of genes below the lower / above the upper threshold
   id.down <- which(f.value <= alpha1);
   id.up <- which(f.value >= alpha2);

   data.down <- fdata[id.down, ];
   ndown <- nrow(data.down);
   data.up <- fdata[id.up, ];
   nup <- nrow(data.up);


   # now plot down regulated genes
   t <- 1:20;
   op <- par(mfrow = c(2,1));
   plot(t, data.down[1, 2:21], type = "l", col = "blue", xlim = c(1,20),
        ylim = c(0,18), main = "Top down-regulated genes");
   for (i in 2:ndown) {
       lines(t,data.down[i, 2:21], type = "l", col = "blue")
   }

   # now plot up regulated genes
   plot(t, data.up[1,2:21], type = "l", col = "red", xlim = c(1,20), 
       ylim = c(0,18), main = "Top up-regulated genes");
   for (i in 2:nup) {
       lines(t, data.up[i,2:21], type = "l", col = "red")
   }
   par(op)

[Package fcros version 1.6.1 Index]