check_P {QCGWAS} | R Documentation |
Checking GWAS p-values
Description
A simple test to check if the reported p-values in a GWAS results file match the other statistics. This function calculates an expected p-value (from the effect size and standard error) and then correlates it with the actual, reported p-value.
Usage
check_P(dataset, HQ_subset,
        plot_correlation = FALSE, plot_if_threshold = FALSE,
        threshold_r = 0.99,
        save_name = "dataset", save_dir = getwd(),
        header_translations,
        use_log = FALSE, dataN = nrow(dataset), ...)
Arguments
dataset |
table with at least three columns: p-value, effect size and standard error. |
HQ_subset |
an optional logical or numeric vector
indicating the rows in |
plot_correlation |
logical; should a scatterplot of
the reported vs. calculated p-values be made? If |
plot_if_threshold |
logical; if |
threshold_r |
numeric; the correlation threshold for the scatterplot. |
save_name |
character string; the filename, without extension, for the scatterplot. |
save_dir |
character string; the directory where the output files are saved. Note that R uses forward slash (/) where Windows uses backslash (\). |
header_translations |
translation table for column names
See |
use_log , dataN |
arguments used by |
... |
arguments passed to |
Details
check_P calculates the expected p-value from a chi-square distribution with 1 degree of freedom, using the squared ratio of the effect size to the standard error, (effect size / standard error)^2, as the test statistic.
In a typical GWAS dataset, the expected and observed p-values should correlate perfectly. If they do not, either a column has been misidentified or the wrong values were used when generating the dataset.
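The calculation described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's own implementation: the simulated data and the variable names (effect, se, p_reported, p_expected) are hypothetical, and pchisq() with lower.tail = FALSE is assumed as the way to obtain the chi-square p-value.

```r
# Hypothetical toy data; these names are illustrative only and are not
# the column headers that check_P itself expects.
set.seed(1)
n      <- 1000
effect <- rnorm(n, mean = 0, sd = 0.05)    # simulated effect sizes
se     <- runif(n, min = 0.01, max = 0.05) # simulated standard errors

# "Reported" p-values, generated here from a two-sided Wald test
p_reported <- 2 * pnorm(-abs(effect / se))

# Expected p-values: upper tail of a chi-square distribution (1 df)
# evaluated at (effect / SE)^2
p_expected <- pchisq((effect / se)^2, df = 1, lower.tail = FALSE)

# When the statistics are consistent, the correlation is essentially 1
r_ok <- cor(p_reported, p_expected)

# Simulate a misidentified column by shuffling the standard errors;
# the correlation with the reported p-values drops
p_wrong <- pchisq((effect / sample(se))^2, df = 1, lower.tail = FALSE)
r_bad   <- cor(p_reported, p_wrong)
```

Comparing r_ok and r_bad against a cut-off such as the default threshold_r of 0.99 shows how a low correlation can flag inconsistent columns.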
Value
The correlation between expected and reported p-values.
Examples
data("gwa_sample")
selected_SNPs <- HQ_filter(data = gwa_sample,
                           FRQ_val = 0.05,
                           cal_val = 0.95,
                           filter_NA = FALSE)
# To calculate a correlation between predicted and actual p-values:
check_P(gwa_sample, HQ_subset = selected_SNPs,
        plot_correlation = FALSE)
# To plot the correlation:
## Not run:
check_P(gwa_sample, HQ_subset = selected_SNPs,
        plot_correlation = TRUE, plot_if_threshold = FALSE,
        save_name = "sample")
## End(Not run)