QQ_plot {QCGWAS} | R Documentation |
QQ plot(s) of expected vs. reported p-values
Description
QQ_plot generates a simple QQ plot of the expected and reported p-value distribution. It includes the option to filter the data with the high-quality filter. QQ_series generates a series of such QQ plots for multiple filter settings.
Usage
QQ_plot(dataset, save_name = "dataset", save_dir = getwd(),
        filter_FRQ = NULL, filter_cal = NULL,
        filter_HWE = NULL, filter_imp = NULL,
        filter_NA = TRUE,
        filter_NA_FRQ = filter_NA, filter_NA_cal = filter_NA,
        filter_NA_HWE = filter_NA, filter_NA_imp = filter_NA,
        p_cutoff = 0.05, plot_QQ_bands = FALSE,
        header_translations,
        check_impstatus = FALSE, ignore_impstatus = FALSE,
        T_strings = c("1", "TRUE", "yes", "YES", "y", "Y"),
        F_strings = c("0", "FALSE", "no", "NO", "n", "N"),
        NA_strings = c(NA, "NA", ".", "-"), ...)
QQ_series(dataset, save_name = "dataset", save_dir = getwd(),
          filter_FRQ = NULL, filter_cal = NULL,
          filter_HWE = NULL, filter_imp = NULL,
          filter_NA = TRUE,
          filter_NA_FRQ = filter_NA, filter_NA_cal = filter_NA,
          filter_NA_HWE = filter_NA, filter_NA_imp = filter_NA,
          p_cutoff = 0.05, plot_QQ_bands = FALSE,
          header_translations,
          check_impstatus = FALSE, ignore_impstatus = FALSE,
          T_strings = c("1", "TRUE", "yes", "YES", "y", "Y"),
          F_strings = c("0", "FALSE", "no", "NO", "n", "N"),
          NA_strings = c(NA, "NA", ".", "-"), ...)
Arguments
dataset |
a data frame containing the p-value column and (depending on the settings) columns for chromosome number, position, the quality parameters, sample size and imputation status. |
save_name |
for |
save_dir |
character string; the directory where the output files are saved. Note that R uses forward slash (/) where Windows uses the backslash (\). |
filter_FRQ , filter_cal , filter_HWE , filter_imp |
Filter threshold-values for allele-frequency, callrate,
HWE p-value and imputation quality, respectively. Passed to
|
filter_NA |
logical; if |
filter_NA_FRQ , filter_NA_cal , filter_NA_HWE , filter_NA_imp |
logical; variable-specific settings for |
p_cutoff |
numeric; the threshold of p-values to be
shown in the QQ plot(s). Higher (less significant) p-values
are excluded from the plot. The default setting is |
plot_QQ_bands |
logical; should probability bands be added to the QQ plot? |
header_translations |
translation table for column names.
See |
check_impstatus |
logical; should the imputation-status
column be passed to |
ignore_impstatus |
logical; if |
T_strings , F_strings , NA_strings |
arguments passed
to |
... |
arguments passed to |
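As an illustration of the p_cutoff argument described above, the following base-R sketch builds the coordinates of a QQ plot and drops the points whose p-value lies above the cutoff. It is not the code used inside QQ_plot: the simulated p-values, the use of ppoints() for the expected quantiles, and all variable names are assumptions made for this example.

```r
# Illustrative sketch only; not the implementation used by QQ_plot.
set.seed(1)
p_obs    <- runif(1000)   # simulated p-values (assumption: null data)
p_cutoff <- 0.05          # same value as the documented default

# Expected p-values under a uniform null, matched to the sorted observed ones
p_sorted <- sort(p_obs)
p_exp    <- ppoints(length(p_sorted))

# Higher (less significant) p-values are excluded from the plot
keep <- p_sorted <= p_cutoff

qq <- data.frame(expected = -log10(p_exp[keep]),
                 observed = -log10(p_sorted[keep]))

plot(qq$expected, qq$observed,
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
abline(0, 1)              # line of identity: observed equals expected
```

In this sketch the expected values are computed from the full set of p-values before the cutoff is applied, so the remaining points keep their original position relative to the line of identity.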
Details
QQ_series accepts multiple filter-values, and passes these one by one to QQ_plot to generate a series of plots. For example, specifying:
  filter_FRQ = c(0.05, 0.10), filter_cal = c(0.90, 0.95)
will generate two plots. The first excludes SNPs with allele frequency < 0.05 or callrate < 0.90; the second excludes SNPs with allele frequency < 0.10 or callrate < 0.95. The same principle applies to the NA-filter settings (filter_NA and its variable-specific versions). If the vectors submitted to the filter arguments are of unequal length, the shorter vector will be recycled until it equals the length of the longer (if possible). To filter missing values only, set the filter to NA and the corresponding NA-filter argument to TRUE. Setting the filter argument to NULL will disable the filter entirely, regardless of the NA-filter setting.
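The recycling and missing-value rules described above can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: the settings table, the helper keep_value() and the toy frq vector are hypothetical, and the helper applies the threshold as a simple lower bound, as in the text.

```r
# Illustrative sketch only; not the implementation used by QQ_series.
filter_FRQ <- c(0.05, 0.10)
filter_cal <- c(0.90, 0.95)
filter_NA  <- TRUE          # length 1, recycled to the longest vector

# One row per plot: shorter vectors are recycled to the longest length
n_plots  <- max(length(filter_FRQ), length(filter_cal), length(filter_NA))
settings <- data.frame(
  plot       = seq_len(n_plots),
  filter_FRQ = rep_len(filter_FRQ, n_plots),
  filter_cal = rep_len(filter_cal, n_plots),
  filter_NA  = rep_len(filter_NA,  n_plots)
)
print(settings)

# Hypothetical helper: which values survive a single filter setting?
keep_value <- function(x, threshold, filter_NA) {
  if (is.null(threshold)) {
    return(rep(TRUE, length(x)))      # NULL disables the filter entirely
  }
  keep <- rep(TRUE, length(x))
  if (!is.na(threshold)) {
    keep[!is.na(x) & x < threshold] <- FALSE   # below-threshold values
  }
  if (filter_NA) {
    keep[is.na(x)] <- FALSE           # missing values, if requested
  }
  keep
}

frq <- c(0.02, 0.20, NA)              # toy allele frequencies
keep_value(frq, 0.05, TRUE)           # threshold and NA filter
keep_value(frq, NA,   TRUE)           # missing values only
keep_value(frq, NULL, TRUE)           # filter disabled
```

With these inputs the settings table has two rows, one per plot, and the three calls to keep_value() correspond to the three cases in the text: a numeric threshold, a threshold of NA with the NA-filter on, and a NULL threshold.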
Value
Both functions return an invisible NULL.
See Also
QC_plots for generating more complex QQ plots as well as Manhattan plots.
QC_histogram for creating histograms.
check_P for comparing the reported p-values to the p expected from the effect size and standard error.
Examples
## Not run:
data("gwa_sample")

QQ_plot(dataset = gwa_sample,
        save_name = "sample_QQ",
        filter_FRQ = 0.01, filter_cal = 0.95,
        filter_NA = FALSE)

QQ_series(dataset = gwa_sample,
          save_name = "sample_QQ",
          filter_FRQ = c(NA, 0.01, 0.01),
          filter_cal = c(NA, 0.95, 0.95),
          filter_NA = c(FALSE, FALSE, TRUE))
## End(Not run)