calibration.plot {cp4p} | R Documentation |
Displaying the "Calibration Plot" of a vector of p-values.
Description
From a proteomics viewpoint, this function displays a graph (the "Calibration Plot") that makes it possible to visually assess whether a differential abundance analysis complies with the assumptions of FDR control procedures.
From a statistical viewpoint, this function plots the cumulative distribution function of 1-p-values. It allows checking whether the p-values satisfy several assumptions of FDR control procedures.
Usage
calibration.plot(p, pi0.method = "pounds", nbins = 20, pz = 0.05)
Arguments
p |
Numeric vector of raw p-values. They are assumed to contain no missing values and to lie between 0 and 1. |
pi0.method |
Numeric value between 0 and 1 corresponding to the proportion of true null hypotheses if known by the user, or the name of an estimation method among |
nbins |
Number of bins. Parameter used for the |
pz |
P-value threshold below which p-values are assumed to correspond to false null hypotheses. Used for the
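The assumptions on p stated above (numeric, no missing values, all values between 0 and 1) can be checked before calling the function. The following is a minimal sketch; check_pvalues is a hypothetical helper written for illustration and is not part of cp4p.

```r
# Hypothetical helper (not part of cp4p): stops with an error if the
# vector is not numeric, contains missing values, or leaves [0, 1].
check_pvalues <- function(p) {
  stopifnot(is.numeric(p), !anyNA(p), all(p >= 0 & p <= 1))
  invisible(p)
}

# A valid vector passes silently
p_ok <- c(0.001, 0.2, 0.75, 1)
check_pvalues(p_ok)

# A vector with a missing value and an out-of-range value is rejected
p_bad <- c(0.01, NA, 1.2)
res <- tryCatch(check_pvalues(p_bad), error = function(e) "invalid")
```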
Details
This function provides a graph which displays the cumulative distribution function of 1-p-values as a function of 1-p-values (black curve) as advocated by Schweder and Spjotvoll (1982).
The blue straight line has a slope equal to the proportion of true null hypotheses (estimated by estim.pi0), which is recalled in the caption of the plot. It is close to the black curve for small 1-p-values if the p-values are independently and uniformly distributed under the null hypothesis.
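The relationship between the black curve and the blue line can be illustrated with base R alone. The sketch below does not reproduce the internals of calibration.plot: the p-values are simulated, and the mixture proportions, the Beta distribution used for the alternatives, and all variable names are assumptions chosen for illustration.

```r
set.seed(1)

# Simulated p-values (illustrative assumption, not the package data):
# 80% true null hypotheses (uniform p-values) and 20% false null
# hypotheses (p-values concentrated near 0).
pi0_true <- 0.8
m <- 5000
n0 <- round(pi0_true * m)
p_sim <- c(runif(n0), rbeta(m - n0, 0.2, 8))

# Empirical cumulative distribution function of 1 - p
F_hat <- ecdf(1 - p_sim)

# Black curve: CDF of 1 - p as a function of 1 - p
t_grid <- seq(0, 1, length.out = 501)
plot(t_grid, F_hat(t_grid), type = "l", col = "black",
     xlab = "1 - p-value", ylab = "Cumulative distribution function")

# Blue line through the origin with slope equal to the proportion
# of true null hypotheses (known here because the data are simulated)
abline(a = 0, b = pi0_true, col = "blue")

# For a small value of 1 - p, the curve and the line nearly coincide
t_small <- 0.3
c(ecdf = F_hat(t_small), line = pi0_true * t_small)
```

In this simulation the alternatives contribute almost nothing to the curve for small 1-p-values, which is why the curve follows the line there and only rises above it as 1-p-values approach 1.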
In addition, two other measures are given in the caption of the graphic. Each has a color that matches that of an area of the plot, and both should be carefully considered to assess whether the p-values are well calibrated (see Giai Gianetto et al. (2016) for details).
The first measure corresponds to one minus the ratio between the green area and the grey area (referred to as "differentially abundant protein concentration"). The closer this measure is to 100%, the smaller the false nondiscovery rate is expected to be.
The second measure corresponds to the total red area observed on the graph (referred to as "uniformity underestimation"). The smaller this measure is, the less the proportion of true null hypotheses is expected to be underestimated (so as to get a conservative p-value adjustment).
Supplementary theoretical justifications on these measures can be found in the tutorial available in the supplementary material of Giai Gianetto et al. (2016).
Value
A list composed of:
pi0 |
Numeric value corresponding to the proportion of true null hypotheses (non-differentially abundant proteins or peptides) used for the plot. Numeric vector if |
h1.concentration |
Numeric value corresponding to one minus the ratio between the green area and the grey area. NULL if |
unif.under |
Numeric value corresponding to the total red area observed on the graph (multiplied by 100). NULL if |
Author(s)
Quentin Giai Gianetto <quentin2g@yahoo.fr>
References
Giai Gianetto, Q., Combes, F., Ramus, C., Bruley, C., Couté, Y., Burger, T. (2016). Calibration plot for proteomics: A graphical tool to visually check the assumptions underlying FDR control in quantitative experiments. Proteomics, 16(1), 29-32.
Schweder, T., Spjotvoll, E. (1982). Plots of p-values to evaluate many tests simultaneously. Biometrika, 69(3), 493-502.
Examples
#Load the package and get p-values
library(cp4p)
data(LFQRatio25)
p=LFQRatio25[,7]
#Plot straight lines whose slopes correspond to different estimates of
#the proportion of true null hypotheses
r=calibration.plot(p, pi0.method="ALL")
r$pi0
#Plot of the graph with the "pounds" method (default)
r=calibration.plot(p)
#Estimate of the proportion of true null hypotheses
r$pi0
#Estimate of the differentially abundant protein concentration
#(the closer to one, the better)
r$h1.concentration
#Estimate of the "uniformity underestimation" quantity
#(if it equals zero, pi0 is not underestimated)
r$unif.under
#Plot of the graph using the "slim" method
r=calibration.plot(p, pi0.method="slim")