HiC2Tree {treediff}    R Documentation

Convert Hi-C to trees

Description

This function converts Hi-C data into trees using adjClust. It takes as input a vector of file paths, the format of the input data, the bin size of the Hi-C matrix, the chromosomes to include in the analysis, and the number of replicates. It returns a list containing the trees, their metadata, the bin index, and the treediff results.

Usage

HiC2Tree(files, format, binsize = NULL, index = NULL, chromosomes, replicates)

Arguments

files

A character vector containing the file paths of the input data.

format

A character vector indicating the format of the input data, with one entry per file in files (as in the examples): "tabular", "cooler", "juicer", or "HiC-Pro".

binsize

An integer indicating the bin size of the Hi-C matrix.

index

A character string giving the path to the index file for the input data. Required (and used) only with the "HiC-Pro" format.

chromosomes

A vector containing the chromosomes to be included in the analysis.

replicates

An integer vector indicating the number of replicates in each condition, as used in treediff (for instance, c(3, 3) for two conditions with three replicates each).

Value

A list containing:

trees

A list of all trees.

metadata

A data frame containing the following columns: names (name of each tree), chromosome, cluster, and file.

index

A data table containing the correspondence of each bin in the genome.

testRes

A list of treediff results for each cluster.

References

Christophe Ambroise, Alia Dehman, Pierre Neuvial, Guillem Rigaill, and Nathalie Vialaneix (2019) Adjacency-constrained hierarchical clustering of a band similarity matrix with application to genomics. Algorithms for Molecular Biology, 14(22), 363–389.

Examples

# three replicates for each of two conditions ("90" and "110")
replicates <- 1:3
cond <- c("90", "110")
# file name prefixes, one per replicate and condition
all_begins <- interaction(expand.grid(replicates, cond), sep = "-")
all_begins <- as.character(all_begins)

# single chromosome
nb_chr <- 1
chromosomes <- 1:nb_chr
# build the matrix file names for each chromosome
all_mat_chr <- lapply(chromosomes, function(chr) {
  paste0("Rep", all_begins, "-chr", chr, "_200000.bed")
})
index <- system.file("extdata", "index.200000.longest18chr.abs.bed",
                     package = "treediff")
binsize <- 200000
files <- system.file("extdata", unlist(all_mat_chr), package = "treediff")
# one format entry per input file
format <- rep("HiC-Pro", length(files))
# number of replicates in each condition
replicates <- c(3, 3)
HiC2Tree(files, format, binsize, index, chromosomes, replicates)

## Not run: 
# two chromosomes
nb_chr <- 2
chromosomes <- 1:nb_chr
all_mat_chr <- lapply(chromosomes, function(chr) {
  paste0("Rep", all_begins, "-chr", chr, "_200000.bed")
})
files <- system.file("extdata", unlist(all_mat_chr), package = "treediff")
# one format entry per input file
format <- rep("HiC-Pro", length(files))
# number of replicates in each condition
replicates <- c(3, 3)
HiC2Tree(files, format, binsize, index, chromosomes, replicates)

## End(Not run)
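
Because files, format, chromosomes and replicates must agree with one another, it can help to check them before calling HiC2Tree. The sketch below is not part of treediff: check_hic_inputs is a hypothetical helper written in base R. The rule that files holds one matrix per replicate and per chromosome is an assumption inferred from the examples above (6 files for one chromosome, 12 for two), not a documented requirement.

```r
# Hypothetical helper (not part of treediff): sanity-check HiC2Tree inputs.
# Assumption inferred from the examples: `files` holds one matrix per
# replicate and per chromosome, and `format` has one entry per file.
check_hic_inputs <- function(files, format, chromosomes, replicates,
                             index = NULL) {
  problems <- character(0)
  if (length(format) != length(files)) {
    problems <- c(problems, sprintf(
      "format has %d entries but files has %d",
      length(format), length(files)))
  }
  expected <- sum(replicates) * length(chromosomes)
  if (length(files) != expected) {
    problems <- c(problems, sprintf(
      "expected %d files (sum(replicates) * number of chromosomes), got %d",
      expected, length(files)))
  }
  # system.file() returns "" when nothing matched
  if (any(!nzchar(files))) {
    problems <- c(problems, "some file paths are empty (file not found?)")
  }
  # The index is documented as required only with the "HiC-Pro" format
  if (any(format == "HiC-Pro") && (is.null(index) || !nzchar(index))) {
    problems <- c(problems, "index is required with the \"HiC-Pro\" format")
  }
  problems
}

# Toy usage with placeholder file names (no files are read)
toy_files <- paste0("Rep", 1:6, "-chr1_200000.bed")
ok <- check_hic_inputs(toy_files, rep("HiC-Pro", 6), chromosomes = 1,
                       replicates = c(3, 3), index = "index.bed")
bad <- check_hic_inputs(toy_files, rep("HiC-Pro", 8), chromosomes = 1,
                        replicates = c(3, 3), index = "index.bed")
```

The helper returns a character vector of problems, which is empty when the inputs look consistent. Collecting messages rather than stopping at the first one makes it easier to fix several mismatches in a single pass.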


[Package treediff version 0.2.1 Index]