pfcoMod {fcros} | R Documentation |
Searching for differentially expressed genes or detecting recurrent copy number aberration probes using an approach based on the Perron-Frobenius theorem
Description
Implementation of a method based on fold change rank and the Perron theorem to search for differentially expressed genes or to detect chromosomal recurrent copy number aberration probes. This function should be used with a matrix of fold changes or ratios from biological dataset (microarray, RNA-seq, ...). The function pfcoMod() is an extention of the function pfco() to a dataset which does not contain replicate samples or to a dataset with one biological condition dataset. Statistics are associated with genes/probes to characterize their change levels.
Usage
pfcoMod(fcMat, samp, log2.opt = 0, trim.opt = 0.25)
Arguments
fcMat |
A matrix containing fold changes or ratios from a biological dataset to process for searching differentially expressed genes or for detecting recurrent copy number aberrations regions. The rownames of fcMat are used for the output idnames. |
samp |
A vector of sample label names which should appear in the columns
of the matrix fcMat: |
log2.opt |
A scalar equals to 0 or 1. The value 0 (default) means that
values in the matrix "fcMat" are expressed in a log2 scale:
|
trim.opt |
A scalar between 0 and 0.5. The value 0.25 (default) means
that 25% of the lower and the upper rank values for each gene
are not used for computing the statistic "ri", i.e. the
interquartile range rank values are averaged:
|
Details
The label names appearing in the parameter "samp" should match some label names of the columns in the data matrix "xdata". It is not necessary to use all label names appearing in the columns of the dataset matrix.
Value
This function returns a data frame containing 8 components
idnames |
A vector containing the list of IDs or symbols associated with genes |
ri |
The average of ordered rank values associated with genes in the dataset. These values are rank statistics leading to the f-values and the p-values. |
FC2 |
The robust fold changes for genes in matrix "fcMat". These fold changes are calculated as a trimed mean of the values in "fcMat". Non log scale values are used in this calculation. |
f.value |
The f-values are probabilities associated with genes using the "mean" and the "standard deviation" ("sd") of values in "ri". The "mean" and "sd" are used as a normal distribution parameters. |
p.value |
The p-values associated with genes. The p-values are obtained from the fold change ranks using a one sample t-test. |
comp |
Singular values. |
comp.w |
Singular values weights. |
comp.wcum |
Cumulative sum of the singular values weights. |
Author(s)
Doulaye Dembele doulaye@igbmc.fr
References
Dembele D, Analysis of high biological data using their rank values, Stat Methods Med Res, accepted for publication, 2018
Examples
data(fdata);
rownames(fdata) <- fdata[,1];
cont <- c("cont01", "cont07", "cont03", "cont04", "cont08");
test <- c("test01", "test02", "test08", "test09", "test05");
log2.opt <- 0;
trim.opt <- 0.25;
# perform pfcoMod()
fc <- fcrosFCmat(fdata, cont, test, log2.opt, trim.opt);
m <- ncol(fc$fcMat)
samp <- paste("Col",as.character(1:m), sep = "");
fc.val <- cbind(data.frame(fc$fcMat))
colnames(fc.val) <- samp
rownames(fc.val) <- fdata[,1]
af <- pfcoMod(fc.val, samp, log2.opt, trim.opt);
# now select top 20 down and/or up regulated genes
top20 <- fcrosTopN(af, 20);
alpha1 <- top20$alpha[1];
alpha2 <- top20$alpha[2];
id.down <- matrix(0,1);
id.up <- matrix(0,1);
n <- length(af$FC);
f.value <- af$f.value;
idown <- 1;
iup <- 1;
for (i in 1:n) {
if (f.value[i] <= alpha1) { id.down[idown] <- i; idown <- idown + 1; }
if (f.value[i] >= alpha2) { id.up[iup] <- i; iup <- iup + 1; }
}
data.down <- fdata[id.down[1:(idown-1)], ];
ndown <- nrow(data.down);
data.up <- fdata[id.up[1:(iup-1)], ];
nup <- nrow(data.up);
# now plot down regulated genes
t <- 1:20;
op = par(mfrow = c(2,1));
plot(t, data.down[1, 2:21], type = "l", col = "blue", xlim = c(1,20),
ylim = c(0,18), main = "Top down-regulated genes");
for (i in 2:ndown) {
lines(t,data.down[i, 2:21], type = "l", col = "blue")
}
# now plot down and up regulated genes
plot(t, data.up[1,2:21], type = "l", col = "red", xlim = c(1,20),
ylim = c(0,18), main = "Top up-regulated genes");
for (i in 2:nup) {
lines(t, data.up[i,2:21], type = "l", col = "red")
}
par(op)