carpools.read.distribution {caRpools}    R Documentation

QC: Plot Readcount Distribution

Description

'carpools.read.distribution' plots the read-count distribution of an NGS data set. The plot shows how read counts are spread across sgRNAs, which lets you check for data skewness and estimate the overall assay quality. For further details see '?carpools.read.distribution'.

Usage

carpools.read.distribution(dataset,namecolumn=1, fullmatchcolumn=2, breaks="",
title="Title", xlab="X-Axis", ylab="Y-Axis",statistics=TRUE,
col=rgb(0, 0, 0, alpha = 0.65), extractpattern=expression("^(.+?)_.+"),
plotgene=NULL, type="distribution", logscale=TRUE)

Arguments

dataset

Data frame of read-count data as created by load.file(). *Default* none *Values* A data frame

namecolumn

In which column are the sgRNA identifiers? *Default* 1 *Values* column number (numeric)

fullmatchcolumn

In which column are the read counts? *Default* 2 *Values* column number (numeric)

breaks

Histogram breaks, see '?hist'. If left at the default, the breaks are calculated according to the dataset length. *Default* "" (empty string, see Usage) *Values* (numeric)

title

Main title of plot *Default* "Title" *Values* "The title you want" (character)

xlab

Label of X-Axis *Default* "X-Axis" *Values* "Label of X-Axis" (character)

ylab

Label of Y-Axis *Default* "Y-Axis" *Values* "Label of Y-Axis" (character)

statistics

Whether basic statistics will be shown in the plot. *Default* TRUE *Values* TRUE, FALSE (boolean)

col

The color of the plotted data. Can be any R color name or RGB color object. See '?rgb' for further information. *Default* rgb(0, 0, 0, alpha = 0.65) *Values* Any R color name or RGB color object (character OR color object)

extractpattern

PERL regular expression used to retrieve the gene identifier from the full sgRNA identifier. For example, from **AAK1_107_0** it extracts **AAK1**, the gene identifier belonging to this sgRNA identifier. **Please see: Read-Count Data Files** *Default* expression("^(.+?)_.+"), which will work for most available libraries. *Values* PERL regular expression with parentheses indicating the gene identifier (expression)
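
As a minimal sketch of what such a pattern does, the following uses base R's sub() with perl = TRUE. It only illustrates the regular expression and is not necessarily how caRpools applies it internally. The identifiers other than AAK1_107_0 are made up for illustration.

```r
# Hypothetical sgRNA identifiers of the form GENE_xxx_y
sgrna_ids <- c("AAK1_107_0", "AAK1_108_1", "TP53_12_0")

# Pattern string as shown in Usage; the first group captures the gene identifier
pattern <- "^(.+?)_.+"

# Replace each full identifier by the first captured group
gene_ids <- sub(pattern, "\\1", sgrna_ids, perl = TRUE)
print(gene_ids)
```

The lazy quantifier in the first group stops at the first underscore, so everything before it is treated as the gene identifier.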

plotgene

Restricts the plot to the read-count distribution of sgRNAs belonging to a single gene, whose identifier is passed via plotgene. *Default* NULL *Values* NULL or gene identifier (character)
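
Selecting the sgRNAs of one gene can be pictured as a plain data frame subset. The sketch below uses base R on a toy table; the column names, the read counts and the filtering approach are assumptions for illustration, not the package's internal code.

```r
# Toy read-count table (hypothetical values): identifiers in column 1, counts in column 2
counts <- data.frame(
  sgRNA = c("AAK1_107_0", "AAK1_108_1", "TP53_12_0", "TP53_13_1"),
  reads = c(120, 95, 300, 0),
  stringsAsFactors = FALSE
)

# Derive the gene identifier from each sgRNA identifier
counts$gene <- sub("^(.+?)_.+", "\\1", counts$sgRNA, perl = TRUE)

# Keep only the rows belonging to the gene of interest
aak1 <- counts[counts$gene == "AAK1", ]
print(aak1)
```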

type

The read-count distribution can be plotted either as a histogram or as a box-and-whisker plot. *Default* "distribution" *Values* "distribution" to plot a histogram, or "whisker" to plot a box-and-whisker plot (character)

logscale

Indicates whether the read count is plotted on a logarithmic scale. *Default* TRUE *Values* TRUE, FALSE (boolean)
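
To give a rough idea of the two plot types on a logarithmic scale, here is a base R sketch on simulated counts. It assumes a log2 transformation (as the axis label in the Examples suggests) and simply drops zero counts. How the function itself treats zeros is not documented here, so treat this as an illustration only.

```r
set.seed(1)
# Simulated read counts (hypothetical data, not from caRpools)
reads <- rnbinom(1000, mu = 200, size = 2)

# log2 of zero is -Inf, which hist() rejects; this sketch drops zero counts
log_reads <- log2(reads[reads > 0])

# Draw to a null PDF device so the sketch also runs non-interactively
pdf(NULL)
# Histogram, comparable to type = "distribution"
h <- hist(log_reads, breaks = 50, col = rgb(0, 0, 0, alpha = 0.65),
          main = "Histogram", xlab = "log2 Readcount", ylab = "# sgRNAs")
# Box-and-whisker plot, comparable to type = "whisker"
b <- boxplot(log_reads, horizontal = TRUE, main = "Box-and-whisker",
             xlab = "log2 Readcount")
dev.off()
```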

Details

none

Value

carpools.read.distribution returns a generic plot that can be passed on to any graphics device.
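
Sending a plot to a device could look like the sketch below. It assumes the function draws on the currently active graphics device, which the description above suggests but does not spell out. A base histogram of made-up values stands in for the caRpools call, so the sketch runs without the package.

```r
# Write to a temporary PDF; other devices such as png() or svg() are used the same way
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 6, height = 4)

# Stand-in for a call such as carpools.read.distribution(CONTROL1, ...)
hist(log2(c(120, 95, 300, 40, 210, 180)), main = "Title",
     xlab = "log2 Readcount", ylab = "# sgRNAs")

# Close the device to flush the file to disk
dev.off()
print(file.exists(out_file))
```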

Note

none

Author(s)

Jan Winter

Examples

# Load the example data set
data(caRpools)

# Histogram of the read-count distribution
carpools.read.distribution(CONTROL1, fullmatchcolumn=2,breaks=200,
  title=d.CONTROL1, xlab="log2 Readcount", ylab="# sgRNAs",statistics=TRUE)

# Same data as a box-and-whisker plot
carpools.read.distribution(CONTROL1, fullmatchcolumn=2,breaks=200,
  title=d.CONTROL1, xlab="log2 Readcount", ylab="# sgRNAs",statistics=TRUE,
  type="whisker")


[Package caRpools version 0.83 Index]