segmentByPairedPSCBS {PSCBS} | R Documentation |
Segment total copy numbers and allele B fractions using the Paired PSCBS method
Description
Segment total copy numbers and allele B fractions using the Paired PSCBS method [1]. This method requires matched normals. This is a low-level segmentation method. It is intended to be applied to one tumor-normal sample at the time.
Usage
## Default S3 method:
segmentByPairedPSCBS(CT, thetaT=NULL, thetaN=NULL, betaT=NULL, betaN=NULL, muN=NULL,
rho=NULL, chromosome=0, x=NULL, alphaTCN=0.009, alphaDH=0.001, undoTCN=0, undoDH=0,
..., avgTCN=c("mean", "median"), avgDH=c("mean", "median"),
flavor=c("tcn&dh", "tcn,dh", "sqrt(tcn),dh", "sqrt(tcn)&dh", "tcn"), tbn=is.null(rho),
joinSegments=TRUE, knownSegments=NULL, dropMissingCT=TRUE, seed=NULL, verbose=FALSE,
preserveScale=FALSE)
Arguments
CT |
A |
thetaT , thetaN |
(alternative) As an alternative to specifying
tumor TCN ratios relative to the match normal by
argument |
betaT |
A |
betaN |
A |
muN |
An optional |
rho |
(alternative to |
chromosome |
(Optional) An |
x |
Optional |
alphaTCN , alphaDH |
The significance levels for segmenting total copy numbers (TCNs) and decrease-in-heterozygosity signals (DHs), respectively. |
undoTCN , undoDH |
Non-negative |
avgTCN , avgDH |
A |
... |
Additional arguments passed to |
flavor |
A |
tbn |
If |
joinSegments |
If |
knownSegments |
Optional |
dropMissingCT |
If |
seed |
An (optional) |
verbose |
See |
preserveScale |
Defunct - gives an error is specified. |
Details
Internally segmentByCBS
() is used for segmentation.
The Paired PSCBS segmentation method does not support weights.
Value
Returns the segmentation results as a PairedPSCBS
object.
Reproducibility
The "DNAcopy::segment" implementation of CBS uses approximation through random sampling for some estimates. Because of this, repeated calls using the same signals may result in slightly different results, unless the random seed is set/fixed.
Whole-genome segmentation is preferred
Although it is possible to segment each chromosome independently using Paired PSCBS, we strongly recommend to segment whole-genome (TCN,BAF) data at once. The reason for this is that downstream CN-state calling methods, such as the AB and the LOH callers, performs much better on whole-genome data. In fact, they may fail to provide valid calls if done chromosome by chromosome.
Missing and non-finite values
The total copy number signals as well as any optional positions
must not contain missing values, i.e. NA
s or NaN
s.
If there are any, an informative error is thrown.
Allele B fractions may contain missing values, because such are
interpreted as representing non-polymorphic loci.
None of the input signals may have infinite values, i.e. -Inf
or +Inf
.
If so, an informative error is thrown.
Paired PSCBS with only genotypes
If allele B fractions for the matched normal (betaN
) are
not available, but genotypes (muN
) are, then it is possible
to run a version of Paired PSCBS where TumorBoost normalization
of the tumor allele B fractions is skipped. In order for this
to work, argument tbn
must be set to FALSE
.
Author(s)
Henrik Bengtsson
References
[1] A.B. Olshen, H. Bengtsson, P. Neuvial, P.T. Spellman, R.A. Olshen, V.E. Seshan, Parent-specific copy number in paired tumor-normal studies using circular binary segmentation, Bioinformatics, 2011
[2] H. Bengtsson, P. Neuvial and T.P. Speed, TumorBoost: Normalization of allele-specific tumor copy numbers from a single pair of tumor-normal genotyping microarrays, BMC Bioinformatics, 2010
See Also
Internally, callNaiveGenotypes
is used to
call naive genotypes, normalizeTumorBoost
is
used for TumorBoost normalization, and segmentByCBS
() is used
to segment TCN and DH separately.
To segment tumor total copy numbers and allele B fractions
without a matched normal, see segmentByNonPairedPSCBS
().
To segment total copy-numbers, or any other unimodal signals,
see segmentByCBS
().
Examples
verbose <- R.utils::Arguments$getVerbose(-10*interactive(), timestamp=TRUE)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Load SNP microarray data
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
data <- PSCBS::exampleData("paired.chr01")
str(data)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Paired PSCBS segmentation
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Drop single-locus outliers
dataS <- dropSegmentationOutliers(data)
# Speed up example by segmenting fewer loci
dataS <- dataS[seq(from=1, to=nrow(data), by=10),]
str(dataS)
R.oo::attachLocally(dataS)
# Paired PSCBS segmentation
fit <- segmentByPairedPSCBS(CT, betaT=betaT, betaN=betaN,
chromosome=chromosome, x=x,
seed=0xBEEF, verbose=verbose)
print(fit)
# Plot results
plotTracks(fit)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Bootstrap segment level estimates
# (used by the AB caller, which, if skipped here,
# will do it automatically)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
fit <- bootstrapTCNandDHByRegion(fit, B=100, verbose=verbose)
print(fit)
plotTracks(fit)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Calling segments in allelic balance (AB)
# NOTE: Ideally, this should be done on whole-genome data
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Explicitly estimate the threshold in DH for calling AB
# (which be done by default by the caller, if skipped here)
deltaAB <- estimateDeltaAB(fit, flavor="qq(DH)", verbose=verbose)
print(deltaAB)
## [1] 0.1657131
fit <- callAB(fit, delta=deltaAB, verbose=verbose)
print(fit)
plotTracks(fit)
# Even if not explicitly specified, the estimated
# threshold parameter is returned by the caller
stopifnot(fit$params$deltaAB == deltaAB)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Calling segments in loss-of-heterozygosity (LOH)
# NOTE: Ideally, this should be done on whole-genome data
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Explicitly estimate the threshold in C1 for calling LOH
# (which be done by default by the caller, if skipped here)
deltaLOH <- estimateDeltaLOH(fit, flavor="minC1|nonAB", verbose=verbose)
print(deltaLOH)
## [1] 0.625175
fit <- callLOH(fit, delta=deltaLOH, verbose=verbose)
print(fit)
plotTracks(fit)
# Even if not explicitly specified, the estimated
# threshold parameter is returned by the caller
stopifnot(fit$params$deltaLOH == deltaLOH)