dhatL2 {LPBkg} | R Documentation |
CD-plot and adjusted deviance test
Description
Construction of CD-plot and adjusted deviance test. The confidence bands are also adjusted for post-selection inference.
Usage
dhatL2(data, g, M = 6, Mmax = NULL, smooth = TRUE,
criterion = "AIC", hist.u = TRUE, breaks = 20, ylim = c(0, 2.5),
range = c(min(data),max(data)), sigma = 2)
Arguments
data |
A vector of data. See details. |
g |
The postulated model against which deviations are assessed. |
M |
The desired size of the polynomial basis to be used. |
Mmax |
The maximum size of the polynomial basis from which |
smooth |
A logical argument indicating if a denoised solution should be implemented. The default is |
criterion |
If |
hist.u |
A logical argument indicating if the CD-plot should be displayed or not. The default is |
breaks |
If |
ylim |
If |
range |
Range of the data/search region considered. |
sigma |
The significance level (in sigmas) with respect to which the confidence bands should be constructed. See details. |
Details
The argument data contains the sample whose distribution is tested for deviations from the postulated model specified in the argument g. In Algeri, 2019, the sample specified under data corresponds to the source-free sample in the background calibration phase and to the physics sample in the signal search phase.
The value of M selected determines the smoothness of the estimated comparison density, with smaller values of M leading to smoother estimates. The deviance test is used to select the value of M which leads to the most significant deviation from the postulated model. The default value for Mmax is set to 20. Note that numerical issues may arise for larger values of Mmax.
If smooth=TRUE, the largest coefficient estimates are selected according to either the AIC or BIC criterion, as described in Algeri, 2019 and Mukhopadhyay, 2017.
If Mmax>1 and/or smooth=TRUE, a post-selection Bonferroni correction is automatically applied to both the deviance test p-value and the confidence bands. The desired level of significance can be expressed as one minus the cdf of a standard normal evaluated at sigma (see Algeri, 2019).
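The conversion from sigma to a significance level, and the general form of a Bonferroni adjustment, can be sketched in base R as follows. This is only an illustration: the helper bonferroni and the count n_candidates are hypothetical names, and the number of candidates that dhatL2 uses internally for its correction is not stated here.

```r
# Significance level corresponding to a threshold expressed in sigmas:
# one minus the standard normal cdf evaluated at sigma.
sigma <- 2
alpha <- 1 - pnorm(sigma)   # approximately 0.0228 for sigma = 2

# Generic Bonferroni adjustment (hypothetical helper, not part of LPBkg):
# multiply the unadjusted p-value by the number of candidates and cap at 1.
bonferroni <- function(p, n_candidates) pmin(1, p * n_candidates)

# Example with an assumed unadjusted p-value and 20 assumed candidates.
p_adj <- bonferroni(0.001, 20)
```

Under these assumptions, a deviation would be declared significant at the chosen level when p_adj falls below alpha.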
Value
Deviance |
Value of the deviance test statistic. |
Dev_pvalue |
Unadjusted p-value of the deviance test. |
Dev_adj_pvalue |
Post-selection Bonferroni adjusted p-value of the deviance test. |
kstar |
Number of coefficients selected by the denoising process. If |
dhat |
Function corresponding to the estimated comparison density in the u domain. |
dhat.x |
Function corresponding to the estimated comparison density in the x domain. |
SE |
Function corresponding to the estimated standard errors of the comparison density in the u domain. |
LBf1 |
Function corresponding to the lower bound of the confidence bands in the u domain. |
UBf1 |
Function corresponding to the upper bound of the confidence bands in the u domain. |
f |
Function corresponding to the estimated density of the data. |
u |
Vector of values corresponding to the cdf of the model specified in |
LP |
Estimates of the coefficients. |
G |
Cumulative distribution function of the postulated model specified in the argument |
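Several of the returned quantities are defined in the u domain, that is, on the scale of the cumulative distribution function of the postulated model, while others are defined on the original x scale. The sketch below illustrates the mapping from x to u for a normal density truncated to a search region, the same setup used in the Examples section. The helper G_fun is a hypothetical name introduced for illustration; it is not claimed to match the package's internal code.

```r
# Postulated density: N(5, 15) truncated to the search region [10, 20].
lower <- 10
upper <- 20
norm_const <- pnorm(upper, 5, 15) - pnorm(lower, 5, 15)
g <- function(x) dnorm(x, 5, 15) / norm_const

# Hypothetical helper: cdf of the truncated density g in closed form.
G_fun <- function(x) (pnorm(x, 5, 15) - pnorm(lower, 5, 15)) / norm_const

# Map a few points of the search region to the u domain.
x_grid <- c(10, 12.5, 15, 17.5, 20)
u <- G_fun(x_grid)   # increasing values from 0 to 1
```

The endpoints of the search region map to 0 and 1, so functions reported in the u domain are evaluated on [0, 1].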
Author(s)
Sara Algeri
References
S. Algeri, 2019. Detecting new signals under background mismodelling. <arXiv:1906.06615>.
S. Mukhopadhyay, 2017. Large-scale mode identification and data-driven sciences. Electronic Journal of Statistics 11 (2017), no. 1, 215–240.
Examples
# Load the package providing BestM and dhatL2
library(LPBkg)
# Generate data and keep the observations in the search region [10, 20]
x<-rnorm(1000,10,7)
xx<-x[x>=10 & x<=20]
# Create a suitable postulated density: N(5, 15) truncated to [10, 20]
G<-pnorm(20,5,15)-pnorm(10,5,15)
g<-function(x){dnorm(x,5,15)/G}
# Choose the best M
Mmax<-20
range<-c(10,20)
m<-BestM(data=xx,g, Mmax,range)
# Vectorize the postulated density and evaluate it at the data
g<-Vectorize(g)
u<-g(xx)
# M has to be sufficiently large, otherwise dhatL2 may crash.
# Here we set m equal to 6 as an example.
m<-6
comp.density<-dhatL2(data=xx,g, M=m, Mmax=Mmax,smooth=FALSE,criterion="AIC",hist.u=TRUE,breaks=20,
ylim=c(0,2.5),range=range,sigma=2)