plot_distribution {QCGWAS}    R Documentation

GWAS effect-size distribution plot

Description

This function generates the boxplot of effect-size distributions that QC_series creates when comparing multiple datasets.

Usage

plot_distribution(data_table,
                  names = 1:ncol(data_table),
                  include = TRUE,
                  plot_order = 1:ncol(data_table),
                  quantile_lines = FALSE,
                  save_name = "Graph_distribution",
                  save_dir = getwd(), ...)

Arguments

data_table

table with a column of effect sizes for every dataset.

names

vector; the names for the datasets, for use in the graph. Note: it's best to keep these very short, as long labels won't be plotted. The default is the column numbers of data_table.

include

logical vector indicating which columns of data_table are included in the plot. The default setting is to include all.

plot_order

numeric vector determining the left-to-right order of plotting the datasets (columns). QC_series uses the sample size for this.

quantile_lines

logical; should lines representing the median and quartile values be included?

save_name

character string; the filename, without extension, for the graph file.

save_dir

character string; the directory where the graph is saved. Note that R uses a forward slash (/) as the path separator, where Windows uses a backslash (\).

...

arguments passed to boxplot.
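
Because save_dir expects an R-style path, a path copied from Windows needs its backslashes converted. The following is a minimal base-R sketch of that conversion; the directory names are invented for illustration, and nothing in it is specific to QCGWAS.

```r
# Hypothetical Windows-style path; inside an R string every backslash
# has to be escaped, hence the doubled "\\"
win_path <- "C:\\Users\\me\\gwas_plots"

# Replace each backslash with a forward slash
r_path <- gsub("\\", "/", win_path, fixed = TRUE)

# file.path() joins its arguments with forward slashes on every platform
plot_dir <- file.path("C:", "Users", "me", "gwas_plots")

print(r_path)
print(identical(r_path, plot_dir))
```

Either r_path or plot_dir could then be supplied as the save_dir argument, provided the directory exists.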

Details

When running a QC over multiple files, QC_series collects the values of the effectsize_HQ output of QC_GWAS in a table, which is then passed to this function. Marked differences in the distribution of effect sizes between datasets usually indicate that the datasets did not use the same model or unit.
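
To illustrate the kind of difference this plot is meant to reveal, the sketch below simulates three effect-size columns in base R, one of them on a different scale, and compares their quartiles. The data, the column names and the factor of 10 are all invented for illustration; this is not how QC_series builds its table.

```r
set.seed(1)

# Simulated effect sizes; cohortC mimics a dataset reported in a different unit
effects <- data.frame(
  cohortA = rnorm(1000, mean = 0, sd = 0.05),
  cohortB = rnorm(1000, mean = 0, sd = 0.05),
  cohortC = rnorm(1000, mean = 0, sd = 0.5)
)

# Lower quartile, median and upper quartile for each dataset
quartiles <- sapply(effects, quantile, probs = c(0.25, 0.5, 0.75))

# Interquartile range per dataset; a much wider spread than the others
# flags a possible difference in model or unit
iqr <- quartiles["75%", ] - quartiles["25%", ]
print(round(iqr, 3))
```

A table such as effects has the shape that data_table expects: one column of effect sizes per dataset.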

Value

An invisible NULL.

Note

There is a known bug with this function when called by QC_series. As input for names, QC_series pastes together a shortened filename and a "N = x" string giving the dataset's sample size.

The filenames are truncated to the first unique element; e.g. files "cohortX_male_HB.txt" and "cohortX_female_HB.txt" become "cohortX_male; N = 608" and "cohortX_female; N = 643", respectively. However, if the unique element is longer than approximately 15 characters, the label is too long to be plotted. The only workaround is to rename the files before passing them to QC_series.
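
One way to catch the problem before running QC_series is to check the length of the distinguishing part of each filename. The sketch below is only an illustration in base R: the filenames are hypothetical, the limit of 15 characters is the approximate figure given above, and stripping the shared "_HB.txt" suffix by hand is an assumption standing in for whatever truncation QC_series actually performs.

```r
# Hypothetical input filenames
files <- c("cohortX_male_HB.txt",
           "cohortX_female_HB.txt",
           "cohortX_postmenopausal_female_HB.txt")

# Assumption: remove the shared suffix by hand to approximate the label stem
stems <- sub("_HB\\.txt$", "", files)

# Approximate limit mentioned in this note
max_chars <- 15

# Flag stems that would probably be too long to plot
too_long <- nchar(stems) > max_chars
print(data.frame(stem = stems, chars = nchar(stems), too_long))
```

Files flagged in this way could then be renamed, for example with file.rename, before they are passed to QC_series.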

See Also

For comparing reported to expected effect-size distribution: QC_histogram.

For other plots comparing GWAS results files: plot_precision and plot_skewness.

Examples

## Not run: 
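  # Load the sample GWAS dataset used in the QCGWAS examples
  data("gwa_sample")

  # Split the effect sizes into three chunks to mimic three datasets
  chunk1 <- gwa_sample$EFFECT[1:1000]
  chunk2 <- gwa_sample$EFFECT[1001:2000]
  chunk3 <- gwa_sample$EFFECT[2001:3000]

  # Plot the three chunks side by side, with median and quartile lines;
  # the graph is saved as "sample_distribution" in the working directory
  plot_distribution(
    data_table = data.frame(chunk1, chunk2, chunk3),
    names = c("chunk 1", "chunk 2", "chunk 3"),
    quantile_lines = TRUE,
    save_name = "sample_distribution")
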
## End(Not run)

[Package QCGWAS version 1.0-9 Index]