geneSelection {Cascade} | R Documentation |
Methods for selecting genes
Description
Selection of differentially expressed genes.
Usage
## S4 method for signature 'micro_array,micro_array,numeric'
geneSelection(
x,
y,
tot.number,
data_log = TRUE,
wanted.patterns = NULL,
forbidden.patterns = NULL,
peak = NULL,
alpha = 0.05,
Design = NULL,
lfc = 0
)
## S4 method for signature 'list,list,numeric'
geneSelection(
x,
y,
tot.number,
data_log = TRUE,
alpha = 0.05,
cont = FALSE,
lfc = 0,
f.asso = NULL
)
## S4 method for signature 'micro_array,numeric'
genePeakSelection(
x,
peak,
y = NULL,
data_log = TRUE,
durPeak = c(1, 1),
abs_val = TRUE,
alpha_diff = 0.05
)
Arguments
x |
either a micro_array object or a list of micro_array objects. In the first case, the micro_array object represents the stimulated measurements. In the second case, the control unstimulated data (if present) should be the first element of the list. |
y |
either a micro_array object or a list of strings. In the first case, the micro_array object represents the unstimulated measurements. In the second case, the list specifies the contrast. |
tot.number |
a number. The number of genes to select. If tot.number < 0, all differentially expressed genes are selected. If tot.number > 1, tot.number is the maximum number of differentially expressed genes that will be selected. If 0 < tot.number < 1, tot.number is the proportion of differentially expressed genes that are selected. |
data_log |
logical (defaults to TRUE); should the data be log-transformed? |
wanted.patterns |
a matrix with wanted patterns [only for geneSelection]. |
forbidden.patterns |
a matrix with forbidden patterns [only for geneSelection]. |
peak |
integer. The time point at which the genes should be selected [optional for geneSelection]. |
alpha |
numeric; the risk level. Defaults to 'alpha=0.05'. |
Design |
the design matrix of the experiment. Defaults to 'NULL'. |
lfc |
log fold change value used in limma's 'topTable'. Defaults to 0. |
cont |
use contrasts. Defaults to 'FALSE'. |
f.asso |
function used to assess the association between the genes. The default value 'NULL' implies the use of the usual 'mean' function. |
durPeak |
vector of length 2 (defaults to c(1,1)); the first element gives the length of the peak on the left, the second on the right [only for genePeakSelection]. |
abs_val |
logical (defaults to TRUE); should genes be selected on the basis of the absolute value of their expression? [only for genePeakSelection] |
alpha_diff |
numeric; the risk level. Defaults to 'alpha_diff=0.05' [only for genePeakSelection]. |
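The three cases described for tot.number can be illustrated with a minimal base-R sketch. The helper n_selected is hypothetical and not part of Cascade; the rounding used for the proportion case and the treatment of the values 0 and 1 are assumptions, since the description above does not cover them.

```r
# Hypothetical helper (not part of Cascade): given n_de differentially
# expressed genes, return how many would be kept under the documented
# tot.number rules.
n_selected <- function(n_de, tot.number) {
  if (tot.number < 0) {
    # negative value: keep every differentially expressed gene
    n_de
  } else if (tot.number > 1) {
    # value above 1: an upper bound on the number of genes kept
    min(n_de, tot.number)
  } else if (tot.number > 0 && tot.number < 1) {
    # value strictly between 0 and 1: a proportion (rounding down is an assumption)
    floor(tot.number * n_de)
  } else {
    # 0 and 1 are not covered by the documented rules
    stop("tot.number equal to 0 or 1 is not covered by the documented rules")
  }
}

n_all  <- n_selected(200, -1)    # all genes
n_max  <- n_selected(200, 50)    # at most 50 genes
n_few  <- n_selected(30, 50)     # fewer genes available than the maximum
n_prop <- n_selected(200, 0.25)  # a quarter of the genes
```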
Value
A micro_array object.
Author(s)
Nicolas Jung, Frédéric Bertrand, Myriam Maumy-Bertrand.
References
Jung, N., Bertrand, F., Bahram, S., Vallat, L., and Maumy-Bertrand, M. (2014). Cascade: a R-package to study, predict and simulate the diffusion of a signal through a temporal gene network. Bioinformatics, btt705.
Vallat, L., Kemper, C. A., Jung, N., Maumy-Bertrand, M., Bertrand, F., Meyer, N., ... & Bahram, S. (2013). Reverse-engineering the genetic circuitry of a cancer cell with predicted intervention in chronic lymphocytic leukemia. Proceedings of the National Academy of Sciences, 110(2), 459-464.
Examples
if(require(CascadeData)){
data(micro_US)
micro_US<-as.micro_array(micro_US,time=c(60,90,210,390),subject=6)
data(micro_S)
micro_S<-as.micro_array(micro_S,time=c(60,90,210,390),subject=6)
#To find the 50 most significantly differentially expressed genes, use:
Selection_1<-geneSelection(x=micro_S,y=micro_US,
tot.number=50,data_log=TRUE)
summary(Selection_1)
#To select genes that are differentially expressed
#at time t60 or t90:
Selection_2<-geneSelection(x=micro_S,y=micro_US,tot.number=30,
wanted.patterns=
rbind(c(0,1,0,0),c(1,0,0,0),c(1,1,0,0)))
summary(Selection_2)
#To select genes that have a differential maximum of expression at a specific time point.
Selection_3<-genePeakSelection(x=micro_S,y=micro_US,peak=1,
abs_val=FALSE,alpha_diff=0.01)
summary(Selection_3)
}
if(require(CascadeData)){
data(micro_US)
micro_US<-as.micro_array(micro_US,time=c(60,90,210,390),subject=6)
data(micro_S)
micro_S<-as.micro_array(micro_S,time=c(60,90,210,390),subject=6)
#Genes with differential expression at t1
Selection1<-geneSelection(x=micro_S,y=micro_US,20,wanted.patterns= rbind(c(1,0,0,0)))
#Genes with differential expression at t2
Selection2<-geneSelection(x=micro_S,y=micro_US,20,wanted.patterns= rbind(c(0,1,0,0)))
#Genes with differential expression at t3
Selection3<-geneSelection(x=micro_S,y=micro_US,20,wanted.patterns= rbind(c(0,0,1,0)))
#Genes with differential expression at t4
Selection4<-geneSelection(x=micro_S,y=micro_US,20,wanted.patterns= rbind(c(0,0,0,1)))
#Genes with global differential expression
Selection5<-geneSelection(x=micro_S,y=micro_US,20)
#We then merge these selections:
Selection<-unionMicro(list(Selection1,Selection2,Selection3,Selection4,Selection5))
print(Selection)
#Prints the correlation graphics Figure 4:
summary(Selection,3)
##Uncomment this code to retrieve geneids.
#library(org.Hs.eg.db)
#
#ff<-function(x){substr(x, 1, nchar(x)-3)}
#ff<-Vectorize(ff)
#
##Here is the function to transform the probeset names to gene ID.
#
#library("hgu133plus2.db")
#
#probe_to_id<-function(n){
#x <- hgu133plus2SYMBOL
#mp<-mappedkeys(x)
#xx <- unlist(as.list(x[mp]))
#genes_all = xx[(n)]
#genes_all[is.na(genes_all)]<-"unknown"
#return(genes_all)
#}
#Selection@name<-probe_to_id(Selection@name)
}
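In the examples above, each row of wanted.patterns is a 0/1 vector with one entry per time point, where 1 marks a time point at which differential expression is wanted. As a minimal base-R sketch, such a matrix can be generated instead of typed by hand. The helper patterns_within is hypothetical and not part of Cascade, and reading 0 as "no differential expression at that time point" is an assumption drawn from the example comments.

```r
# Hypothetical helper (not part of Cascade): enumerate every 0/1 pattern over
# n_times time points whose 1s fall only within `allowed`, excluding the
# all-zero pattern.
patterns_within <- function(n_times, allowed) {
  # every 0/1 combination over the allowed time points
  grid <- as.matrix(expand.grid(rep(list(0:1), length(allowed))))
  # drop the combination with no differential expression at all
  grid <- grid[rowSums(grid) > 0, , drop = FALSE]
  # place the combinations in the allowed columns, zeros elsewhere
  out <- matrix(0, nrow = nrow(grid), ncol = n_times)
  out[, allowed] <- grid
  out
}

# Patterns with differential expression at the first and/or second of four
# time points (t60 and t90 in the examples above)
wanted <- patterns_within(4, allowed = c(1, 2))
print(wanted)
```

The rows of wanted are the same three patterns as in the Selection_2 example, so the matrix could be passed as wanted.patterns in a call of that form.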