NGS.CNV {saasCNV}    R Documentation

CNV Analysis Pipeline for WGS and WES Data

Description

All analysis steps are integrated into a single pipeline. The results, including visualization plots, are placed in a directory specified by the user.

Usage

NGS.CNV(vcf, output.dir, sample.id, 
    do.GC.adjust = FALSE, 
    gc.file = system.file("extdata","GC_1kb_hg19.txt.gz",package="saasCNV"), 
    min.chr.probe = 100, min.snps = 10, 
    joint.segmentation.pvalue.cutoff = 1e-04, max.chpts = 30, 
    do.merge = TRUE, use.null.data = TRUE, 
    num.perm = 1000, maxL = NULL, 
    merge.pvalue.cutoff = 0.05, 
    do.cnvcall.on.merge = TRUE, 
    cnvcall.pvalue.cutoff = 0.05, 
    do.plot = TRUE, cex = 0.3, ref.num.probe = NULL, 
    do.gene.anno = FALSE, 
    gene.anno.file = NULL,
    seed = NULL, 
    verbose = TRUE)

Arguments

vcf

a data frame constructed from a VCF file. See vcf2txt.

output.dir

the directory in which all results will be placed.

sample.id

the sample ID to be displayed in the results data frame and in the titles of some diagnostic plots.

do.GC.adjust

logical. Whether GC content adjustment of log2ratio should be carried out. Default is FALSE. See GC.adjust for details.

gc.file

the location of a tab-delimited file with GC content information (averaged per 1kb window). See GC.adjust for details.

min.chr.probe

the minimum number of probes tagging a chromosome for it to be passed to the subsequent analysis.

min.snps

the minimum number of probes a segment needs to span.

joint.segmentation.pvalue.cutoff

the p-value cut-off below which one change point (or a pair of change points) is deemed significant in each cycle of joint segmentation.

max.chpts

the maximum number of change points to be detected for each chromosome.

do.merge

logical. Whether the segment merging step should be carried out. Default is TRUE.

use.null.data

logical. Whether only data for probes located in normal-copy segments should be used for bootstrapping. Default is TRUE. If more aggressive merging is needed, it can be switched to FALSE.

num.perm

the number of replicates drawn by bootstrap.

maxL

integer. The maximum length, in number of probes, that a bootstrapped segment may span. Default is NULL, in which case it is automatically set to 1/100 of the number of data points.

merge.pvalue.cutoff

a p-value cut-off for merging. If the empirical p-value is greater than the cut-off value, the two adjacent segments under consideration will be merged.

do.cnvcall.on.merge

logical. Whether CNV calling should be done on the segments after the merging step. Default is TRUE. If FALSE, CNV calling will be done on the segments resulting directly from joint segmentation, without the merging step.

cnvcall.pvalue.cutoff

a p-value cut-off for CNV calling.

do.plot

logical. Whether diagnostic plots should be output. Default is TRUE.

cex

a numerical value giving the amount by which plotting text and symbols should be magnified relative to the default. It can be adjusted in order to make the plot legible.

ref.num.probe

integer. The reference number of probes against which a segment is compared in order to determine the cex with which the segment is displayed. Default is NULL, in which case it is automatically set to 1/100 of the number of data points.

do.gene.anno

logical. Whether the gene annotation step should be performed. Default is FALSE.

gene.anno.file

a tab-delimited file containing gene annotation information, for example a RefSeq annotation file, which can be obtained from the UCSC Genome Browser.

seed

integer. A random seed that can be set to make the results reproducible. Default is NULL.

verbose

logical. Whether more details should be output. Default is TRUE.
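
The filtering rule described for min.chr.probe and the NULL fallback described for maxL and ref.num.probe can be sketched in base R as follows. This is an illustrative sketch, not the package's internal code: the toy data frame probes, its column chr, and the helper resolve.default are hypothetical, and rounding with floor() is an assumption.

```r
## Toy probe table: one chromosome label per probe (hypothetical data)
probes <- data.frame(chr = rep(c("chr1", "chr2", "chrY"), times = c(500, 300, 40)),
                     stringsAsFactors = FALSE)

## min.chr.probe: keep only chromosomes tagged by at least this many probes
min.chr.probe <- 100
n.per.chr <- table(probes$chr)
keep.chr <- names(n.per.chr)[as.vector(n.per.chr) >= min.chr.probe]
probes.kept <- probes[probes$chr %in% keep.chr, , drop = FALSE]

## maxL / ref.num.probe: a NULL value falls back to 1/100 of the data points
## (floor() is an assumption about the rounding)
resolve.default <- function(value, n) {
  if (is.null(value)) floor(n / 100) else value
}
maxL <- resolve.default(NULL, nrow(probes.kept))           # derived from the data
ref.num.probe <- resolve.default(1000, nrow(probes.kept))  # user value is kept
```

In this toy input, chrY has fewer than 100 probes and is dropped before the fallback value is derived.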

Details

See the vignettes of the package for more details.
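
To illustrate how num.perm, seed and merge.pvalue.cutoff interact, the sketch below computes an empirical p-value from resampled differences and applies the merging rule stated above (merge when the p-value is greater than the cut-off). It is a simplified stand-in, not the package's bootstrap procedure: the simulated null values, the segment length of 50, the observed difference and the helper empirical.pvalue are all hypothetical.

```r
## Empirical p-value: share of null statistics at least as extreme as observed
empirical.pvalue <- function(observed, null.stats) {
  mean(abs(null.stats) >= abs(observed))
}

set.seed(123456789)          # fixing the seed makes the resampling reproducible
num.perm <- 1000
null.values <- rnorm(5000)   # stand-in for signal in normal-copy segments

## Difference in means between two resampled segments, repeated num.perm times
null.stats <- replicate(num.perm, {
  mean(sample(null.values, 50, replace = TRUE)) -
    mean(sample(null.values, 50, replace = TRUE))
})

observed.diff <- 0.01        # hypothetical difference between adjacent segments
p.value <- empirical.pvalue(observed.diff, null.stats)

merge.pvalue.cutoff <- 0.05
do.merge.pair <- p.value > merge.pvalue.cutoff   # TRUE: merge the two segments
```

A larger cut-off makes merging less likely, since fewer pairs of adjacent segments will have a p-value above it.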

Value

The results, including visualization plots, are placed in subdirectories of the output directory output.dir specified by the user.
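
After a run, the generated files can be inspected with base R. The sketch below uses a temporary directory with hypothetical placeholder names (example_subdir, example_result.txt) in place of a real output.dir; the actual subdirectory and file names are determined by the package and are not reproduced here.

```r
## Stand-in for an output directory produced by a pipeline run
output.dir <- file.path(tempdir(), "test_saasCNV")
dir.create(file.path(output.dir, "example_subdir"), recursive = TRUE,
           showWarnings = FALSE)
writeLines("placeholder",
           file.path(output.dir, "example_subdir", "example_result.txt"))

## Subdirectories created directly under output.dir
sub.dirs <- list.dirs(output.dir, full.names = FALSE, recursive = FALSE)

## All files, with paths relative to output.dir
out.files <- list.files(output.dir, recursive = TRUE)
```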

Author(s)

Zhongyang Zhang <zhongyang.zhang@mssm.edu>

References

Zhongyang Zhang and Ke Hao. (2015) SAAS-CNV: A Joint Segmentation Approach on Aggregated and Allele Specific Signals for the Identification of Somatic Copy Number Alterations with Next-Generation Sequencing Data. PLoS Computational Biology, 11(11):e1004618.

See Also

vcf2txt, cnv.data, joint.segmentation, merging.segments, cnv.call, diagnosis.seg.plot.chr, genome.wide.plot, diagnosis.cluster.plot

Examples

## Not run: 
## NGS pipeline analysis
## download vcf_table.txt.gz
url <- "https://zhangz05.u.hpc.mssm.edu/saasCNV/data/vcf_table.txt.gz"
tryCatch({download.file(url=url, destfile="vcf_table.txt.gz")
         }, error = function(e) {
          download.file(url=url, destfile="vcf_table.txt.gz", method="curl")
         })
## If download.file fails to download the data, please manually download it from the url.

vcf_table <- read.delim(file="vcf_table.txt.gz", as.is=TRUE)

## download refGene_hg19.txt.gz
url <- "https://zhangz05.u.hpc.mssm.edu/saasCNV/data/refGene_hg19.txt.gz"
tryCatch({download.file(url=url, destfile="refGene_hg19.txt.gz")
         }, error = function(e) {
          download.file(url=url, destfile="refGene_hg19.txt.gz", method="curl")
         })
## If download.file fails to download the data, please manually download it from the url.

sample.id <- "WES_0116"
output.dir <- file.path(getwd(), "test_saasCNV")

NGS.CNV(vcf=vcf_table, output.dir=output.dir, sample.id=sample.id,
        min.chr.probe=100,
        min.snps=10,
        joint.segmentation.pvalue.cutoff=1e-4,
        max.chpts=30,
        do.merge=TRUE, use.null.data=TRUE, num.perm=1000, maxL=2000, 
        merge.pvalue.cutoff=0.05,
        do.cnvcall.on.merge=TRUE, 
        cnvcall.pvalue.cutoff=0.05,
        do.plot=TRUE, cex=0.3, ref.num.probe=1000,
        do.gene.anno=TRUE,
        gene.anno.file="refGene_hg19.txt.gz",
        seed=123456789,
        verbose=TRUE)

## End(Not run)


[Package saasCNV version 0.3.4 Index]