
H3K27Ac {MAnorm2}

R Documentation

ChIP-seq Samples for H3K27Ac in Human Lymphoblastoid Cell Lines

Description

Based on the associated ChIP-seq samples, this dataset profiles H3K27Ac levels across the whole genome in multiple human lymphoblastoid cell lines, each derived from a different individual. Specifically, a set of genomic intervals of roughly equal size (about 2 kb) was systematically selected to cover the part of the genome that is enriched with reads in at least one of the ChIP-seq samples. For each of these intervals, the dataset records the raw read count and the enrichment status in each sample.

Usage

H3K27Ac

Format

H3K27Ac is a data frame that describes the H3K27Ac ChIP-seq signals of 73,828 non-overlapping genomic intervals in multiple human lymphoblastoid cell lines. It contains the following variables:

chrom, start, end: Genomic coordinate of each interval. Note that these coordinates are 0-based and correspond to the hg19 genome assembly.
cellLine_H3K27Ac_num.read_cnt: Each variable whose name is of this form records the number of reads from a ChIP-seq sample that fall within each genomic interval. For example, GM12891_H3K27Ac_2.read_cnt corresponds to the 2nd biological replicate of a ChIP-seq experiment that targets H3K27Ac in a cell line named GM12891.
cellLine_H3K27Ac_num.occupancy: Each variable whose name is of this form records the enrichment status of each genomic interval in a ChIP-seq sample. An enrichment status of 1 indicates that the interval is enriched with reads in the sample; an enrichment status of 0 indicates that it is not. In practice, the enrichment status of a genomic interval in a given ChIP-seq sample can be determined by its overlap with the peaks of that sample (see "References" below). Note also that these variables correspond one-to-one to the raw read count variables.
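As a rough illustration of how an occupancy variable could be derived from peak overlap, the following base-R sketch marks an interval as enriched when it overlaps any peak on the same chromosome. The toy intervals, the peak table, and the helper overlaps_any_peak are hypothetical. Treating coordinates as 0-based, half-open ranges and counting any overlap as enrichment are assumptions made for this example, not necessarily the rule used to build H3K27Ac.

```r
# Toy genomic intervals laid out like the first three columns of H3K27Ac
# (all values are made up for illustration).
bins <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(0L, 2000L, 5000L),
  end   = c(2000L, 4000L, 7000L),
  stringsAsFactors = FALSE
)

# Hypothetical peak calls for a single ChIP-seq sample
# (assumed 0-based, half-open coordinates).
peaks <- data.frame(
  chrom = c("chr1", "chr2"),
  start = c(500L, 6900L),
  end   = c(900L, 7400L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns 1 for an interval that overlaps at least one
# peak on the same chromosome, and 0 otherwise.
overlaps_any_peak <- function(bins, peaks) {
  vapply(seq_len(nrow(bins)), function(i) {
    same_chrom <- peaks$chrom == bins$chrom[i]
    hit <- same_chrom &
      peaks$start < bins$end[i] &
      peaks$end > bins$start[i]
    as.integer(any(hit))
  }, integer(1))
}

occupancy <- overlaps_any_peak(bins, peaks)
```

A quadratic scan like this is only meant for small examples; for tables the size of H3K27Ac a dedicated interval-overlap tool would be a more practical choice.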

Each cell line derives from a separate individual of the Caucasian population. Use attr(H3K27Ac, "metaInfo") to get a data frame that records meta information about the involved individuals.
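The naming scheme and the "metaInfo" attribute can be explored with a few lines of base R. The sketch below uses a small made-up data frame in place of H3K27Ac so that it runs without the package installed. The columns of the toy "metaInfo" attribute (cellLine, note) are hypothetical placeholders, since the actual columns are not listed on this page.

```r
# Toy stand-in for H3K27Ac that follows the same column naming scheme
# (values are made up).
toy <- data.frame(
  chrom = "chr1", start = 0L, end = 2000L,
  GM12891_H3K27Ac_1.read_cnt = 12L, GM12891_H3K27Ac_1.occupancy = 1L,
  GM12891_H3K27Ac_2.read_cnt = 9L,  GM12891_H3K27Ac_2.occupancy = 1L,
  GM12892_H3K27Ac_1.read_cnt = 2L,  GM12892_H3K27Ac_1.occupancy = 0L,
  stringsAsFactors = FALSE
)

# Hypothetical meta information attached the same way as in H3K27Ac.
attr(toy, "metaInfo") <- data.frame(
  cellLine = c("GM12891", "GM12892"),
  note     = c("placeholder", "placeholder"),
  stringsAsFactors = FALSE
)

# Locate the raw read count columns and build the matching occupancy names.
cnt_cols <- grep("\\.read_cnt$", names(toy), value = TRUE)
occ_cols <- sub("\\.read_cnt$", ".occupancy", cnt_cols)

# Confirm that every read count column has an occupancy counterpart.
all_paired <- all(occ_cols %in% names(toy))

# The cell line name is the text before "_H3K27Ac_".
cell_line <- sub("_H3K27Ac_.*$", "", cnt_cols)
replicates_per_line <- table(cell_line)

# Retrieve the meta information data frame.
meta <- attr(toy, "metaInfo")
```

With MAnorm2 installed, replacing toy with H3K27Ac should apply the same steps to the real dataset.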

Source

Raw sequencing data were obtained from Kasowski et al., 2013 (see "References" below). Adapters and low-quality bases were trimmed from the 3' ends of reads using trim_galore. The resulting reads were then aligned to the hg19 reference genome with bowtie. MACS was used to call peaks for each ChIP-seq sample.

Finally, MAnorm2_utils was used to integrate the alignment results and the peaks of the ChIP-seq samples into this table. MAnorm2_utils is specifically designed to create input tables for MAnorm2. See the MAnorm2_utils home page for more information. It is also available from the PyPI repository as a Python package.

References

Zhang, Y., et al., Model-based analysis of ChIP-Seq (MACS). Genome Biol, 2008. 9(9): p. R137.

Kasowski, M., et al., Extensive variation in chromatin states across humans. Science, 2013. 342(6159): p. 750-2.

[Package MAnorm2 version 1.2.2 Index]