RN_select {RNentropy} | R Documentation |
Select transcripts/genes with significant p-values.
Description
Select transcripts with a corrected global p-value lower than a user-defined threshold and provide a summary of over- or under-expression according to local p-values.
Usage
RN_select(Results, gpv_t = 0.01, lpv_t = 0.01, method = "BH")
Arguments
Results |
The output of RNentropy or RN_calc. |
gpv_t |
Threshold for global p-value. (Default: 0.01) |
lpv_t |
Threshold for local p-value. (Default: 0.01) |
method |
Multiple testing correction method. Any method accepted by p.adjust can be used; type p.adjust.methods to see the list. Default: BH (Benjamini & Hochberg). |
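As an illustration of how gpv_t and method interact, the following minimal sketch applies the same kind of correction to a made-up vector of global p-values stored as -log10. The gene names and values are invented for the example; only p.adjust and p.adjust.methods from base R are used, and the comparison on the -log10 scale mirrors the function body shown in the Examples section.

```r
# Toy global p-values stored as -log10, as in Results$gpv (values are made up).
gpv <- c(geneA = 6.0, geneB = 3.0, geneC = 2.2, geneD = 0.5)

gpv_t <- 0.01          # threshold given on the ordinary p-value scale
raw_p <- 10^-gpv       # convert -log10 values back to p-values

# Any name listed in p.adjust.methods can be passed as 'method'.
adj_bh   <- p.adjust(raw_p, method = "BH")
adj_bonf <- p.adjust(raw_p, method = "bonferroni")

# Compare on the -log10 scale: larger values are more significant.
keep_bh   <- -log10(adj_bh)   >= -log10(gpv_t)
keep_bonf <- -log10(adj_bonf) >= -log10(gpv_t)

print(p.adjust.methods)
print(rbind(BH = keep_bh, bonferroni = keep_bonf))
```

In this toy case the more conservative Bonferroni correction retains fewer genes than BH at the same threshold, which is the kind of difference to expect when changing method.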
Value
The original input containing
gpv |
-log10 of the global p-values |
lpv |
-log10 of the local p-values |
c_like |
Results formatted as in the output of the C++ implementation of RNentropy. |
res |
The results data.frame containing the original expression values together with the -log10 of global and local p-values. |
design |
The experimental design matrix. |
and a new dataframe
selected |
Transcripts/genes with a corrected global p-value lower than gpv_t. For each condition it contains a column whose values can be -1, 0, 1 or NA. 1 means that all replicates of the condition have an expression value higher than the average and a local p-value <= lpv_t (the gene is over-expressed in this condition). -1 means that all replicates of the condition have an expression value lower than the average and a local p-value <= lpv_t (the gene is under-expressed in this condition). 0 means that at least one replicate has a local p-value > lpv_t. NA means that the local p-values of the replicates are inconsistent for this condition, that is, at least one replicate is over-expressed and at least one is under-expressed. |
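The coding of the per-condition columns can be explored with ordinary data frame operations. The sketch below builds a small mock of the selected data frame rather than calling the package: the label column name GENE and all values are hypothetical, while the column names GL_LPV, Corr. GL_LPV and condition_1, condition_2 follow the naming used in the function body shown in the Examples section.

```r
# Mock of Results$selected; the GENE column and all values are made up.
selected <- data.frame(
  GENE           = c("g1", "g2", "g3", "g4"),
  GL_LPV         = c(12.1, 9.4, 7.7, 5.2),
  `Corr. GL_LPV` = c(11.5, 9.1, 7.5, 5.1),
  condition_1    = c(1, 0, -1, NA),
  condition_2    = c(-1, 1, 0, 1),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# which() drops NA comparisons, so inconsistent genes are not picked up here.
over_c1  <- selected$GENE[which(selected$condition_1 == 1)]
under_c1 <- selected$GENE[which(selected$condition_1 == -1)]

# Genes whose replicates disagree in condition 1 are coded NA.
inconsistent_c1 <- selected$GENE[is.na(selected$condition_1)]

# Tabulate the codes for every condition column, keeping NA as a category.
cond_cols <- grep("^condition_", colnames(selected), value = TRUE)
counts <- lapply(selected[cond_cols], table, useNA = "ifany")
print(counts)
```

Using which() or an explicit is.na() check matters here because a plain logical comparison against NA yields NA, which would introduce spurious NA entries into the subset.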
Author(s)
Giulio Pavesi - Dep. of Biosciences, University of Milan
Federico Zambelli - Dep. of Biosciences, University of Milan
Examples
data("RN_Brain_Example_tpm", "RN_Brain_Example_design")
# Compute statistics and p-values (using only a subset of genes because of
# the CRAN running-time limit for examples)
Results <- RN_calc(RN_Brain_Example_tpm[1:10000,], RN_Brain_Example_design)
Results <- RN_select(Results)
## The function is currently defined as
function (Results, gpv_t = 0.01, lpv_t = 0.01, method = "BH")
{
    # Convert both thresholds to the -log10 scale used by Results$gpv and Results$lpv
    lpv_t <- -log10(lpv_t)
    gpv_t <- -log10(gpv_t)
    # Back-transform the global p-values, correct them, and return to -log10
    Results$gpv_bh <- -log10(p.adjust(10^-Results$gpv, method = method))
    # Rows whose corrected global value passes the threshold
    true_rows <- (Results$gpv_bh >= gpv_t)
    design_b <- t(Results$design > 0)
    Results$lpv_sel <- data.frame(row.names = rownames(Results$lpv)[true_rows])
    # Add one summary column per row of design_b
    for (d in seq_along(design_b[, 1])) {
        col <- apply(Results$lpv[true_rows, ], 1, ".RN_select_lpv_row",
            design_b[d, ], lpv_t)
        Results$lpv_sel <- cbind(Results$lpv_sel, col)
        colnames(Results$lpv_sel)[length(Results$lpv_sel)] <- paste("condition",
            d, sep = "_")
    }
    # Non-numeric columns of the results table are kept as labels
    lbl <- Results$res[, !sapply(Results$res, is.numeric)]
    Results$selected <- cbind(lbl[true_rows], Results$gpv[true_rows],
        Results$gpv_bh[true_rows], Results$lpv_sel)
    colnames(Results$selected) <- c(names(which(!sapply(Results$res,
        is.numeric))), "GL_LPV", "Corr. GL_LPV", colnames(Results$lpv_sel))
    # Sort by the third column, in decreasing order
    Results$selected <- Results$selected[order(Results$selected[, 3],
        decreasing = TRUE), ]
    # Drop the temporary per-condition table before returning
    Results$lpv_sel <- NULL
    return(Results)
}