PeakSegFPOP_file {PeakSegDisk} | R Documentation |
PeakSegFPOP using disk storage
Description
Run the PeakSeg Functional Pruning Optimal Partitioning algorithm,
using a file on disk to store the O(N) function piece lists,
each of size O(log N).
This is a low-level function that only runs the algorithm
and writes the result files (without reading them into R),
so most users should instead use PeakSegFPOP_dir,
which calls this function and then reads the result files into R.
Usage
PeakSegFPOP_file(bedGraph.file,
pen.str, db.file = NULL)
Arguments
bedGraph.file |
character scalar: tab-delimited tabular text file with four columns: chrom, chromStart, chromEnd, coverage. The algorithm creates a large temporary file in the same directory, so make sure that there is disk space available on that device. |
pen.str |
character scalar that can be converted to a numeric scalar via
as.numeric: non-negative penalty. A larger penalty yields fewer
peaks. "0" and "Inf" are OK. A character value is required rather than
numeric, so that the user can reliably find the results in the
output files, which are in the same directory as |
db.file |
character scalar: file for writing temporary cost function database – there will be a lot of disk writing to this file. Default NULL means to write to the same disk where the input bedGraph file is stored; another option is tempfile(), which may result in speedups if the input bedGraph file is on a slow network disk and the temporary storage is a fast local disk. |
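The pen.str requirements above can be sketched in base R. Both helpers below, is_valid_pen_str and penalty_file, are hypothetical names introduced only for illustration; they are not part of PeakSegDisk, and the package may check its input differently. The file-name pattern is the one used by outf in the Examples section of this page.

```r
# Hypothetical helper (not part of PeakSegDisk): check that a value meets
# the documented requirements for pen.str, i.e. a character scalar that
# as.numeric converts to a non-negative number.
is_valid_pen_str <- function(pen.str){
  if(!(is.character(pen.str) && length(pen.str) == 1)){
    return(FALSE)
  }
  pen.num <- suppressWarnings(as.numeric(pen.str))
  !is.na(pen.num) && 0 <= pen.num
}

# Hypothetical helper: build an output file name following the pattern
# used by outf() in the Examples section.
penalty_file <- function(bedGraph.file, pen.str, suffix){
  paste0(bedGraph.file, "_penalty=", pen.str, suffix)
}

is_valid_pen_str("0")    # TRUE
is_valid_pen_str("Inf")  # TRUE
is_valid_pen_str("-1")   # FALSE, negative
is_valid_pen_str("abc")  # FALSE, not convertible to numeric
is_valid_pen_str(10.5)   # FALSE, numeric rather than character

# With a character penalty, the file name contains exactly the text given.
penalty_file("coverage.bedGraph", "100000", "_loss.tsv")

# With a numeric penalty, paste0 converts the number to text itself, and
# the result may use scientific notation, so it is harder to predict.
penalty_file("coverage.bedGraph", 100000, "_loss.tsv")
```

Passing the penalty as a string keeps the file name under the caller's control, which is the reason given above for requiring a character value.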
Value
A named list of input parameters, and the temporary cost function database file size in megabytes.
Author(s)
Toby Dylan Hocking
Examples
r <- function(chrom, chromStart, chromEnd, coverage){
  data.frame(chrom, chromStart, chromEnd, coverage)
}
four <- rbind(
  r("chr1", 0, 10, 2),
  r("chr1", 10, 20, 10),
  r("chr1", 20, 30, 14),
  r("chr1", 30, 40, 13))
dir.create(prob.dir <- tempfile())
coverage.bedGraph <- file.path(prob.dir, "coverage.bedGraph")
write.table(
  four, coverage.bedGraph,
  sep="\t", row.names=FALSE, col.names=FALSE)
pstr <- "10.5"
result.list <- PeakSegDisk::PeakSegFPOP_file(coverage.bedGraph, pstr)
dir(prob.dir)
## segments file can be read to see optimal segment means.
outf <- function(suffix){
  paste0(coverage.bedGraph, "_penalty=", pstr, suffix)
}
segments.bed <- outf("_segments.bed")
seg.df <- read.table(segments.bed)
## col.name.list is qualified with PeakSegDisk:: so that this code
## also runs when the package is not attached.
names(seg.df) <- PeakSegDisk::col.name.list$segments
seg.df
## loss file can be read to see optimal Poisson loss, etc.
loss.tsv <- outf("_loss.tsv")
loss.df <- read.table(loss.tsv)
names(loss.df) <- PeakSegDisk::col.name.list$loss
loss.df
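
The db.file argument described above can be tried with a variant of the same example. This is a minimal sketch, assuming PeakSegDisk is installed (the call is skipped otherwise); it places the temporary cost function database in a tempfile() and then inspects the returned list with str() rather than assuming any particular element names. The data frame construction and quote=FALSE are choices made for this sketch, not taken from the example above.

```r
# Write a small four-row bedGraph file to a fresh temporary directory.
four <- data.frame(
  chrom = "chr1",
  chromStart = c(0, 10, 20, 30),
  chromEnd = c(10, 20, 30, 40),
  coverage = c(2, 10, 14, 13))
dir.create(prob.dir <- tempfile())
coverage.bedGraph <- file.path(prob.dir, "coverage.bedGraph")
write.table(
  four, coverage.bedGraph,
  sep="\t", quote=FALSE, row.names=FALSE, col.names=FALSE)

# Run the solver only if PeakSegDisk is available.
if(requireNamespace("PeakSegDisk", quietly=TRUE)){
  # db.file=tempfile() puts the temporary database on local temporary
  # storage instead of next to the input bedGraph file.
  result.list <- PeakSegDisk::PeakSegFPOP_file(
    coverage.bedGraph, "10.5", db.file=tempfile())
  # Show the input parameters and database size that were returned.
  str(result.list)
  # The result files are written next to the input bedGraph file.
  print(dir(prob.dir))
}
```

Whether this is faster than the default depends on the storage involved; the benefit described above applies when the input file is on a slow network disk and the temporary directory is on a fast local disk.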