PBNPA {PBNPA}

R Documentation

Permutation Based Non-Parametric Analysis of CRISPR Screen Data

Description

This function takes raw read count data from a CRISPR screen and performs a permutation-based non-parametric analysis to identify genes that are hits in the positive or negative screen.

Usage

PBNPA(dat, sim.no = 10, alpha.threshold = 0.2, fdr = 0.05)

Arguments

`dat`	A list in which each element holds the raw read count data for one replicate. Each element should be a data frame with four columns: the first, named 'sgRNA', is the sgRNA index; the second, named 'Gene', is the gene index; the third is the initial (or control) read count; and the fourth is the final (or treatment) read count. Missing read counts are replaced with 0.
`sim.no`	Number of permutations used to compute the unadjusted p-value. Defaults to 10.
`alpha.threshold`	Threshold used to remove genes with significant p-values. Defaults to 0.2.
`fdr`	FDR threshold used to select genes as hits. Defaults to 0.05.
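
As an illustration of the layout expected for `dat`, the sketch below builds a two-replicate list by hand. The sgRNA and gene labels, the counts, the helper `make_replicate`, and the column names `initial` and `final` are all invented for this example; only the column names 'sgRNA' and 'Gene' are prescribed above.

```r
# Toy input in the shape described for `dat`: one data frame per replicate,
# four columns each. All values are invented for illustration.
make_replicate <- function(initial, final) {
  data.frame(
    sgRNA   = paste0("sg", 1:6),                   # sgRNA index
    Gene    = rep(c("geneA", "geneB"), each = 3),  # gene index
    initial = initial,                             # initial / control read count
    final   = final,                               # final / treatment read count
    stringsAsFactors = FALSE
  )
}

# The NA below stands for a missing read count; per the description of `dat`,
# such values are replaced with 0 by the function.
rep1 <- make_replicate(c(120, 95, 110, 80, 130, 101), c(30, 25, NA, 85, 140, 99))
rep2 <- make_replicate(c(115, 90, 105, 85, 125, 98),  c(28, 31, 40, 90, 120, 102))

dat <- list(rep1, rep2)

# Inspect the structure of the first replicate.
str(dat[[1]])
```

A list of this shape is what would be passed as the `dat` argument. A real screen would contain far more genes and sgRNAs than this toy input.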

Details

PBNPA implements a permutation-based non-parametric analysis of CRISPR screen data. The algorithm is described in Jia et al. (2017), "A permutation-based non-parametric analysis of CRISPR screen data", BMC Genomics <doi:10.1186/s12864-017-3938-5>. Please cite this paper if you use the algorithm in your work.

Value

A list of 5 elements is returned. The first element, pos.gene, is the index of the genes identified as hits for the positive screen when FDR is controlled at the selected level; the second element, pos.number, is the number of genes identified as hits for the positive screen; the third element, neg.gene, is the index of the genes identified as hits for the negative screen when FDR is controlled at the selected level; the fourth element, neg.number, is the number of genes identified as hits for the negative screen; the fifth element is a data frame containing the unadjusted and FDR-adjusted p-values for all genes, for both negative and positive selection.
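
The following sketch shows one way the returned list might be inspected. The helper `summarise_hits` and the `mock_result` object are hypothetical and exist only for this illustration; the element names pos.gene, pos.number, neg.gene and neg.number are taken from the description above, and the fifth element is accessed by position because its name is not given on this page.

```r
# Summarise a result list that follows the layout described under Value.
summarise_hits <- function(result) {
  cat("Positive-screen hits:", result$pos.number, "\n")
  cat("Negative-screen hits:", result$neg.number, "\n")
  # Fifth element: per-gene table of p-values, accessed by position.
  invisible(result[[5]])
}

# Hypothetical stand-in for a return value; not real PBNPA output.
mock_result <- list(
  pos.gene   = c("geneA"),
  pos.number = 1,
  neg.gene   = c("geneB", "geneC"),
  neg.number = 2,
  data.frame(Gene = c("geneA", "geneB", "geneC"))  # placeholder table
)

pvals <- summarise_hits(mock_result)
```

With a real result, the same helper could be called as `summarise_hits(result)`, and the returned data frame could then be examined with `head()`.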

Examples

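# Locate the three simulated replicate files shipped with the package
dat11 <- system.file('extdata', 'simdata_20per_off50.csv', package = 'PBNPA')
dat22 <- system.file('extdata', 'simdata_20per_off49.csv', package = 'PBNPA')
dat33 <- system.file('extdata', 'simdata_20per_off48.csv', package = 'PBNPA')
# Read each replicate into a data frame
dat1 <- read.csv(dat11, header = TRUE)
dat2 <- read.csv(dat22, header = TRUE)
dat3 <- read.csv(dat33, header = TRUE)
# Combine the replicates into a list and run the analysis with default settings
datlist <- list(dat1, dat2, dat3)
result <- PBNPA(datlist)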

[Package PBNPA version 0.0.3 Index]