sequentialSearch_dir {PeakSegDisk}R Documentation

Compute PeakSeg model with given number of peaks

Description

Compute the most likely peak model with at most the number of peaks given by peaks.int. This function repeated calls PeakSegFPOP_dir with different penalty values, until either (1) it finds the peaks.int model, or (2) it concludes that there is no peaks.int model, in which case it returns the next simplest model (with fewer peaks than peaks.int). The first pair of penalty values (0, Inf) is run in parallel via the user-specified future plan, if the future.apply package is available.

Usage

sequentialSearch_dir(problem.dir, 
    peaks.int, verbose = 0)

Arguments

problem.dir

problemID directory in which coverage.bedGraph has already been computed. If there is a labels.bed file then the number of incorrect labels will be computed in order to find the target interval of minimal error penalty values.

peaks.int

int: target number of peaks.

verbose

numeric verbosity: if >0 then cat is used to print a message for each penalty.

Value

Same result list from PeakSegFPOP_dir, with an additional component "others" describing the other models that were computed before finding the optimal model with peaks.int (or fewer) peaks. Additional loss columns are as follows: under=number of peaks in smaller model during binary search; over=number of peaks in larger model during binary search; iteration=number of times PeakSegFPOP has been run.

Author(s)

Toby Dylan Hocking

Examples


## Create simple 6 point data set discussed in supplementary
## materials. GFPOP/GPDPA computes up-down model with 2 peaks, but
## neither CDPA (PeakSegDP::cDPA) nor PDPA (jointseg)
r <- function(chrom, chromStart, chromEnd, coverage){
  data.frame(chrom, chromStart, chromEnd, coverage)
}
supp <- rbind(
  r("chr1", 0, 1,  3),
  r("chr1", 1, 2, 9),
  r("chr1", 2, 3, 18),
  r("chr1", 3, 4, 15),
  r("chr1", 4, 5, 20),
  r("chr1", 5, 6, 2)
)
data.dir <- file.path(tempfile(), "chr1-0-6")
dir.create(data.dir, recursive=TRUE)
write.table(
  supp, file.path(data.dir, "coverage.bedGraph"),
  sep="\t", row.names=FALSE, col.names=FALSE)

## register a parallel future plan to compute the first two
## penalties in parallel during the sequential search.
if(interactive() && requireNamespace("future"))future::plan("multisession")

## Compute optimal up-down model with 2 peaks via sequential search.
fit <- PeakSegDisk::sequentialSearch_dir(data.dir, 2L)

if(require(ggplot2)){
  ggplot()+
    theme_bw()+
    geom_point(aes(
      chromEnd, coverage),
      data=supp)+
    geom_segment(aes(
      chromStart+0.5, mean,
      xend=chromEnd+0.5, yend=mean),
      data=fit$segments,
      color="green")
}


[Package PeakSegDisk version 2023.11.27 Index]