determine.C {RFlocalfdr}    R Documentation

determine.C

Description

By assumption, there is a point q such that to the left of q, f_B ~ f_0(z). That is, there are only null values to the left of q. We determine q using a change point method related to penalized model selection. See Gauran, Iris Ivy M., Park, Junyong, Lim, Johan, Park, DoHwan, Zylstra, John, Peterson, Thomas, Kann, Maricel and Spouge, John L. "Empirical null estimation using zero-inflated discrete mixture distributions and its application to protein domain data", Biometrics, 2018, 74(2).
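The idea of a penalized change point search can be illustrated with a small toy example. This is only a sketch under stated assumptions and is not the algorithm implemented in determine.C: the simulated density, the quadratic fit on the log scale, the penalty `lambda`, and all variable names are hypothetical choices made for illustration. For each candidate cut, a null model is fitted to the points left of the cut, and every point left out (treated as non-null) costs a fixed penalty.

```r
# Toy illustration only; NOT the method used inside determine.C.
# Simulated density heights: a null component plus a small bump on the right.
x  <- seq(0, 6, length.out = 120)
y  <- dnorm(x, mean = 2, sd = 0.7) + 0.15 * dnorm(x, mean = 4.5, sd = 0.4)
ly <- log(y)  # the log of a normal density is quadratic in x

lambda     <- 0.01            # hypothetical penalty per point treated as non-null
candidates <- 10:length(x)    # candidate change points (index into x)

cost <- sapply(candidates, function(k) {
  d   <- data.frame(x = x[seq_len(k)], ly = ly[seq_len(k)])
  fit <- lm(ly ~ x + I(x^2), data = d)          # null fit on the left part only
  sum(residuals(fit)^2) + lambda * (length(x) - k)
})

# The estimated change point is the grid value with the smallest penalized cost.
q_hat <- x[candidates[which.min(cost)]]
q_hat
```

The penalty pushes the cut to the right as long as the null model still fits, while the misfit term pushes it back once non-null mass enters the fitted region.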

Usage

determine.C(f_fit, df, t1, trace.plot = FALSE, start_at = 30, debug.flag = 0)

Arguments

f_fit

object returned by f.fit

df

data frame containing x and y

t1

initial estimates of xi, omega, and lambda, generally as returned by fit.to.data.set.wrapper

trace.plot

if TRUE, produce a plot of each fit with a 1 second sleep between plots, so that the sequence can be watched as a movie.

start_at

x <- f_fit$midpoints is of length 119 (quite arbitrary). The first start_at values of x are used to fit the skew-normal distribution.

debug.flag

debugging level. If debug.flag > 0 then some output is printed to the screen.

Value

A numeric vector of length equal to the number of rows in df (119 in this case). Call this vector qq. The value "C" is the value of x at which qq attains its minimum. Consistent with the Description, to the left of C the data are generated from the null distribution, and to the right of C we have a mixture of the null and non-null distributions.
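A minimal sketch of how the returned vector might be turned into a cutoff, using toy stand-ins rather than real output of determine.C. The values of `x`, `qq`, and `imp_toy` below are hypothetical. The Examples note that the 95th quantile can be used when the minimum does not look reasonable; taking that quantile over the transformed importances is an assumption made here.

```r
# Toy stand-ins (hypothetical values, not output of determine.C).
x  <- seq(0.05, 5.95, length.out = 119)  # plays the role of f_fit$midpoints
qq <- (x - 3)^2                          # plays the role of the returned vector

# C is the grid value at which qq is smallest.
cc <- x[which.min(qq)]

# Fallback: a 95th quantile, here taken over simulated importances
# (assumption: the quantile refers to the transformed importances).
set.seed(42)
imp_toy <- rexp(1000)
cc_q95  <- unname(quantile(imp_toy, probs = 0.95))

c(cc = cc, cc_q95 = cc_q95)
```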

Examples

data(imp20000)
imp <- log(imp20000$importances)
t2 <- imp20000$counts
temp <- imp[t2 > 1]   #see
temp <- temp[temp != -Inf]
temp <- temp - min(temp) + .Machine$double.eps
f_fit <- f.fit(temp)
y <- f_fit$zh$density
x <- f_fit$midpoints
df <- data.frame(x, y)
initial.estimates <- fit.to.data.set.wrapper(df, temp, try.counter = 3, return.all = FALSE)
initial.estimates <- initial.estimates$Estimate

qq <- determine.C(f_fit, df, initial.estimates, start_at = 37, trace.plot = FALSE)
cc <- x[which.min(qq)]
plot(x, qq, main = "determine cc")
abline(v = cc)
# Unfortunately the minimum does not appear reasonable. In this case it is advisable to use the
# 95th quantile.


# Needs the chromosome 22 data in RFlocalfdr.data. Also has a long runtime.
library(RFlocalfdr.data)
data(ch22)
?ch22
t2 <- ch22$C
imp <- log(ch22$imp)
# Determine a cutoff to get a unimodal density.
res.temp <- determine_cutoff(imp, t2, cutoff = c(25, 30, 35, 40), plot = c(25, 30, 35, 40), Q = 0.75)
plot(c(25, 30, 35, 40), res.temp[, 3])
imp <- imp[t2 > 30]
debug.flag <- 0
temp.dir <- tempdir()  # defined here because the calls below pass temp.dir
f_fit <- f.fit(imp, debug.flag = debug.flag, temp.dir = temp.dir)
# makes the plot histogram_of_variable_importances.png
y <- f_fit$zh$density
x <- f_fit$midpoints
plot(density(imp), main = "histogram and fitted spline")
lines(x, y, col = "red")
df <- data.frame(x, y)
initial.estimates <- fit.to.data.set.wrapper(df, imp, debug.flag = debug.flag, plot.string = "initial",
                                              temp.dir = temp.dir, try.counter = 3)
initial.estimates <- data.frame(summary(initial.estimates)$parameters)$Estimate
# 1.102303 1.246756 1.799169
qq <- determine.C(f_fit, df, initial.estimates, start_at = 37, trace.plot = TRUE)
cc <- x[which.min(qq)]
plot(x, qq, main = "determine cc")
abline(v = cc)


[Package RFlocalfdr version 0.8.5 Index]