Time-stamp: 2013-03-23 12:32:12 Hongbo Liu
Epigenetic modifications play critical roles in the regulation of gene expression and chromatin remodeling. Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-Seq) has been widely used for genome-wide profiling of chromatin modifications and DNA-binding proteins. The unprecedented scale and precision of epigenomic data have enabled the quantitative analysis of differential epigenetic status in gene regulation in various biological processes. To address the lack of powerful ChIP-Seq analysis methods, we present a novel algorithm named Quantitative Differential Chromatin Modification Region (QDCMR). QDCMR quantifies chromatin modification differences and identifies differential chromatin modification regions (DCMRs) from genome-wide chromatin modification profiles by adapting Shannon entropy. Its platform-free and species-free nature makes it easy for computational biologists to analyze the epigenetic regulation related to DCMRs across various temporal and spatial chromatin modifications.
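To give a feel for how Shannon entropy can separate uniform from sample-specific signal, here is a minimal Python sketch. It uses the textbook entropy formula on one region's modification levels; it is not the adapted formula that QDCMR itself implements, and the function name and example values are hypothetical.

```python
import math


def shannon_entropy(levels):
    """Textbook Shannon entropy (in bits) of one region's levels across samples.

    Assumption: levels are non-negative and are turned into probabilities by
    dividing by their sum. QDCMR's adapted entropy may differ from this.
    """
    if any(x < 0 for x in levels):
        raise ValueError("levels must be non-negative")
    total = sum(levels)
    if total <= 0:
        raise ValueError("at least one level must be positive")
    probs = [x / total for x in levels]
    # Terms with p == 0 are skipped because p * log(p) tends to 0 as p -> 0.
    return sum(-p * math.log2(p) for p in probs if p > 0)


# Hypothetical values: the same level in four samples vs. signal in one sample.
uniform = shannon_entropy([5.0, 5.0, 5.0, 5.0])
specific = shannon_entropy([20.0, 0.0, 0.0, 0.0])
print(uniform, specific)
```

In this sketch a region with the same level in every sample reaches the maximum entropy (log2 of the number of samples), while a region marked in a single sample has entropy zero, so low entropy points to a large difference across samples.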
Currently, QDCMR can be run with a visual interface (QDCMRforView) or from the command line (QDCMRforCommand). As of Release 1.0, both QDCMRforView and QDCMRforCommand run under Java 1.6 or higher. Please make sure that a recent version of Java is installed; if QDCMR does not start, install or update the Java runtime first.
QDCMR with a visual interface helps users handle ChIP-Seq data. This version of QDCMR provides the following functions:
1. Import Data: Preprocess and import chromatin modification data.
2. Quantify Difference: Quantify chromatin modification difference across various samples.
3. Identify DCMRs: Identify DCMRs using the threshold embedded in QDCMR.
4. Measure Specificity: Measure sample specificity for each DCMR.
5. Export Results: Save results.
6. Visualization: Display chromatin modification level, DCMR distribution and UCSC links.
Download the compressed package "QDCMRforView.rar" and decompress it.
- On Windows, double-click the StartWin.bat file.
- On Linux, run the StartLinux.sh file.
The file "Tutorial.html" included in QDCMRforView.rar provides more detailed information about QDCMRforView.
The command-line version of QDCMR helps users process large ChIP-Seq data sets automatically on a server. It processes the data according to the user's command.
Download the compressed package "QDCMRforCommand.rar" and decompress it.
For the processed data
java -jar -Xmx1024m QDCMR.jar -P infile=example.gct,ResultFolder=Result,SD=0.07
Example command
java -jar -Xmx1024m QDCMR.jar -P infile=/pub2/hbliu/QDCMR/DATA/example.gct,ResultFolder=/pub2/hbliu/QDCMR/DATA/Result,SD=0.07
- -Xmx1024m: Set the maximum Java heap size to 1024 MB.
- -P: Run on processed data.
- infile: Path to the input file, which contains the regions with chromatin modification data across multiple samples (GCT format).
- ResultFolder: Path to the folder where QDCMR writes its analysis results.
- SD: Standard deviation of the probability model used for the DCMR threshold.
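When QDCMR is run from a script, the command for processed data can be assembled programmatically. The following Python sketch only builds the argument list in the same order as the command above; the helper name build_processed_command and its defaults are hypothetical, and it assumes that paths contain no commas, since the parameters are comma-separated.

```python
def build_processed_command(infile, result_folder, sd=0.07,
                            max_heap_mb=1024, jar="QDCMR.jar"):
    """Build the argument list for the -P (processed data) mode.

    Hypothetical helper: it mirrors the documented command line and does
    not validate that the files or folders exist.
    """
    for path in (infile, result_folder):
        if "," in path:
            # Parameters are comma-separated, so a comma in a path is ambiguous.
            raise ValueError(f"path must not contain a comma: {path}")
    params = f"infile={infile},ResultFolder={result_folder},SD={sd}"
    return ["java", "-jar", f"-Xmx{max_heap_mb}m", jar, "-P", params]


command = build_processed_command("example.gct", "Result")
print(" ".join(command))

# To actually launch QDCMR (requires Java and QDCMR.jar in the working directory):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.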
For the region and raw chromatin modification data
java -jar -Xmx1024m QDCMR.jar -R RegionFile=example.bed,ReadFolder=ReadFile,ResultFolder=Result,SD=0.07,ExpandLength=200,BinSize=1,DepthNormalization=NO
Example command
java -jar -Xmx1024m QDCMR.jar -R RegionFile=/pub2/hbliu/QDCMR/DATA/example.bed,ReadFolder=/pub2/hbliu/QDCMR/DATA/ReadFile,ResultFolder=/pub2/hbliu/QDCMR/DATA/Result,SD=0.07,ExpandLength=200,BinSize=1,DepthNormalization=NO
The -R mode takes the seven comma-separated parameters shown in the command above. They are described below, together with the Java memory option and the mode option.
- -Xmx1024m: Set the maximum Java heap size to 1024 MB.
- -R: Run on region data and raw chromatin modification data.
- RegionFile: Path to the input region file (BED format).
- ReadFolder: Path to the folder containing the raw ChIP-Seq read files of the chromatin modifications (BED format).
- ResultFolder: Path to the folder where QDCMR writes its analysis results.
- SD: Standard deviation of the probability model used for the DCMR threshold.
- ExpandLength: Applies to single-end sequencing data. Set it to the length of the sequence fragment represented by each read. Each read is first expanded to this length and then mapped to the corresponding region.
- BinSize: The unit of region segmentation. The default value is 1 bp, meaning that the total read number in each region is normalized by the region length.
- DepthNormalization: YES or NO (YES applies depth normalization, NO skips it). To account for the different sequencing depths of the ChIP-Seq files, the read number is further normalized by the total read number of the given ChIP-Seq file relative to the mean of the total read numbers of all ChIP-Seq files used.
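The read expansion and the two normalization steps described above can be sketched in Python as follows. This is one reading of the parameter descriptions, not QDCMR's actual code: the function names, the strand-aware direction of the expansion, and the example numbers are all assumptions.

```python
def expand_read(start, end, strand, expand_length):
    """Expand a read to expand_length bp.

    Assumption: reads on '+' are extended downstream from their start and
    reads on '-' are extended upstream from their end. The documentation
    only states that each read is expanded to the given length.
    """
    if strand == "+":
        return start, start + expand_length
    return max(0, end - expand_length), end


def normalize_counts(raw_counts, region_length, total_reads,
                     bin_size=1, depth_normalization=True):
    """Normalize per-sample read counts for one region.

    raw_counts:  {sample: reads overlapping the region}
    total_reads: {sample: total reads in that sample's ChIP-Seq file}
    """
    if region_length <= 0 or bin_size <= 0:
        raise ValueError("region_length and bin_size must be positive")
    bins = region_length / bin_size          # BinSize=1 divides by region length
    mean_total = sum(total_reads.values()) / len(total_reads)
    normalized = {}
    for sample, count in raw_counts.items():
        value = count / bins
        if depth_normalization:
            # Scale by this file's depth relative to the mean depth of all files.
            value /= total_reads[sample] / mean_total
        normalized[sample] = value
    return normalized


# Hypothetical example: sample A was sequenced twice as deeply as sample B.
levels = normalize_counts({"A": 200, "B": 100}, region_length=1000,
                          total_reads={"A": 20_000_000, "B": 10_000_000})
print(levels)
```

In this example the raw counts differ by a factor of two only because of sequencing depth, so both samples end up with the same normalized level.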
BED files use a tab-delimited (tabular) format.
Does not require header line
Contains 6 required fields:
chrom - The name of the chromosome (e.g. chr3, chrY, chr2_random) or contig (e.g. ctgY1).
chromStart - The starting position of the feature in the chromosome or contig. The first base in a chromosome is numbered 0.
chromEnd - The ending position of the feature in the chromosome or contig. The chromEnd base is not included in the display of the feature.
description1 - The description for the region, such as region id, gene name, etc.
description2 - The description for the region, such as region id, gene name, etc.
strand - Defines the strand - either '+' or '-'.
bed format online: http://genome.ucsc.edu/FAQ/FAQformat.html#format1
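A small Python sketch of reading one line in the six-field layout described above is given below. The Region type and the parse_region_line helper are hypothetical names used for illustration; the checks simply follow the field descriptions (0-based start, end not included, strand '+' or '-').

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    chrom_start: int   # 0-based, inclusive
    chrom_end: int     # exclusive
    description1: str
    description2: str
    strand: str


def parse_region_line(line):
    """Parse one tab-delimited line with the six required fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 6:
        raise ValueError(f"expected 6 tab-delimited fields, got {len(fields)}")
    chrom, start, end, desc1, desc2, strand = fields[:6]
    region = Region(chrom, int(start), int(end), desc1, desc2, strand)
    if region.chrom_start < 0 or region.chrom_end <= region.chrom_start:
        raise ValueError(f"invalid coordinates: {start}-{end}")
    if region.strand not in ("+", "-"):
        raise ValueError(f"invalid strand: {strand!r}")
    return region


# The first 100 bases of chr3 (bases numbered 0-99); names are made up.
example = parse_region_line("chr3\t0\t100\tregion_1\tGENE_A\t+\n")
length = example.chrom_end - example.chrom_start
print(example, length)
```

Because the start is 0-based and the end is excluded, the region length is simply chromEnd minus chromStart.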
To create a GCT file, add two header rows at the top of the file:
In the first row, first cell, enter: #1.2
In the second row, first cell, enter the number of data rows: N
In the second row, second cell, enter the number of data columns: M
The third row is the header line.
The remaining rows contain the data: the first two columns are the ID and the description, followed by the N*M data matrix.
gct format online: http://www.broadinstitute.org/cancer/software/genepattern/gp_guides/file-formats#_Creating_Input_Files_GCT
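The GCT layout described above can be produced with a few lines of Python. This is a minimal sketch: the write_gct helper, the sample names, and the values are hypothetical, while the "Name" and "Description" column headers follow the GCT 1.2 convention.

```python
import io


def write_gct(handle, sample_names, rows):
    """Write rows of (region_id, description, values) in GCT 1.2 layout."""
    for region_id, _, values in rows:
        if len(values) != len(sample_names):
            raise ValueError(f"{region_id}: expected {len(sample_names)} values")
    handle.write("#1.2\n")                                    # row 1: version
    handle.write(f"{len(rows)}\t{len(sample_names)}\n")       # row 2: N and M
    handle.write("\t".join(["Name", "Description", *sample_names]) + "\n")
    for region_id, description, values in rows:
        cells = [region_id, description, *(str(v) for v in values)]
        handle.write("\t".join(cells) + "\n")


# Hypothetical data: two regions (N=2) measured in two samples (M=2).
buffer = io.StringIO()
write_gct(buffer, ["sample_A", "sample_B"], [
    ("region_1", "GENE_A", [0.15, 0.02]),
    ("region_2", "GENE_B", [0.0, 0.4]),
])
gct_text = buffer.getvalue()
print(gct_text)
```

To write to disk instead of an in-memory buffer, pass a file opened in text mode, for example open("example.gct", "w").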
QDMR, a tool for identification and analysis of differentially methylated regions: http://bioinfo.hrbmu.edu.cn/qdmr/
GCER, our group: http://202.97.205.78/gcer/
MACS, a tool for ChIP-chip/seq analysis: https://github.com/taoliu/MACS
bedTools, a useful toolkit for genome annotation files: http://code.google.com/p/bedtools/
Please cite the following paper if you use QDCMR or related information:
Liu, H., Chen, Y., Lv, J., Zhu, R., Su, J., Liu, X., Zhang, Y. and Wu, Q. (2013) Quantitative epigenetic co-variation in CpG islands and co-regulation of developmental genes. Scientific Reports, 3, 2576.