PSSeg {jointseg} | R Documentation |
Parent-Specific copy number segmentation
Description
This function splits (bivariate) copy number signals into parent-specific (PS) segments using recursive binary segmentation.
Usage
PSSeg(data, method, stat = NULL, dropOutliers = TRUE,
rankTransform = FALSE, ..., profile = FALSE, verbose = FALSE)
Arguments
data |
Data frame containing the following columns:
These data are assumed to be ordered by genome position. |
method |
|
stat |
A vector containing the names or indices of the columns of
|
dropOutliers |
If TRUE, outliers are dropped using the DNAcopy package |
rankTransform |
If TRUE, data are replaced by their ranks before segmentation |
... |
Further arguments to be passed to |
profile |
Trace time and memory usage? |
verbose |
A |
Details
Before segmentation, the decrease in heterozygosity d=2|b-1/2| defined in Bengtsson et al, 2010 is calculated from the input data. d is only defined for heterozygous SNPs, that is, SNPs for which data$genotype==1/2. d may be seen as a "mirrored" version of allelic ratios (b): it converts them to a piecewise-constant signal by taking advantage of the bimodality of b for heterozygous SNPs. The rationale for this transformation is that allelic ratios (b) are only informative for heterozygous SNPs (see e.g. Staaf et al, 2008).
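The transformation can be illustrated with a minimal base R sketch. The values of `b` and `genotype` below are made up for illustration, the coding of homozygous genotypes as 0 and 1 is an assumption, and this is not the package's internal code:

```r
## Illustrative allelic ratios (made-up values, not from the package)
b <- c(0.02, 0.45, 0.98, 0.70, 0.30, 0.55)
## Assumed coding: 1/2 marks heterozygous SNPs (as in the text above);
## 0 and 1 are assumed here to mark homozygous SNPs
genotype <- c(0, 1/2, 1, 1/2, 1/2, 1/2)

## Decrease in heterozygosity: d = 2 * |b - 1/2|
d <- 2 * abs(b - 1/2)
## d is only defined for heterozygous SNPs
d[genotype != 1/2] <- NA_real_

print(round(d, 2))
```

In this toy example the ratios 0.70 and 0.30, which lie symmetrically around 1/2, map to the same value d = 0.4: this is the "mirroring" that turns the bimodal signal b into a unimodal one.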
Before segmentation, the outliers in the copy number signal are dropped according to the method described by Venkatraman, E. S. and Olshen, A. B., 2007.
The resulting data are then segmented using the jointSeg function, which combines an initial segmentation according to argument method and pruning of candidate change points by dynamic programming (skipped when the initial segmentation *is* dynamic programming).
If argument stat is not provided, then dynamic programming is run on the two-dimensional statistic "(c,d)".
If argument stat is provided, then dynamic programming is run on stat; in this case we implicitly assume that stat is a piecewise-constant signal.
Value
A list with elements
- bestBkp
Best set of breakpoints after dynamic programming
- initBkp
Results of the initial segmentation, using 'doNnn', where 'Nnn' corresponds to argument method
- dpBkpList
Results of dynamic programming: a list of vectors of breakpoint positions for the best model with k breakpoints, for k=1, 2, ..., K, where K=length(initBkp)
- prof
A matrix providing time usage (in seconds) and memory usage (in Mb) for the main steps of the program. Only defined if argument profile is set to TRUE
Author(s)
Morgane Pierre-Jean and Pierre Neuvial
References
Bengtsson, H., Neuvial, P., & Speed, T. P. (2010). TumorBoost: Normalization of allele-specific tumor copy numbers from a single pair of tumor-normal genotyping microarrays. BMC bioinformatics, 11(1), 245.
Staaf, J., Lindgren, D., Vallon-Christersson, et al. (2008). Segmentation-based detection of allelic imbalance and loss-of-heterozygosity in cancer cells using whole genome SNP arrays. Genome Biol, 9(9), R136.
Pierre-Jean, M., Rigaill, G. J., & Neuvial, P. (2015). Performance evaluation of DNA copy number segmentation methods. Briefings in Bioinformatics, no. 4, 600-615.
See Also
Examples
## load known real copy number regions
affyDat <- acnr::loadCnRegionData(dataSet="GSE29172", tumorFraction=0.5)
## generate a synthetic CN profile
K <- 10
len <- 1e4
sim <- getCopyNumberDataByResampling(len, K, regData=affyDat)
datS <- sim$profile
## run binary segmentation (+ dynamic programming)
resRBS <- PSSeg(data=datS, method="RBS", stat=c("c", "d"), K=2*K, profile=TRUE)
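## time and memory usage of the main steps
resRBS$prof
## true and false positives among the estimated breakpoints (tolerance: 5)
getTpFp(resRBS$bestBkp, sim$bkp, tol=5)
## plot the profile with the true and the estimated breakpoints
plotSeg(datS, breakpoints=list(sim$bkp, resRBS$bestBkp))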