Get the most out of your ChIP-Seq experiments
Posted by Radhouane Aniba
November 7, 2012
Knowing which regions are differentially enriched can help us understand the biology of transcription factor binding, a field that, despite all the studies in the literature, has not yet revealed all its secrets. I have come across a lot of publications lately on the nature of binding itself, some of them dealing with the structure and characteristics of binding sites. To make the most of your ChIP-Seq experiments, it is crucial to relate this binding to a stimulus caused by a disease or a genomic alteration. The real question, though, is this: whatever change we observe in the data, is it the cause of a phenotype, or a consequence of the real alteration, which might not be captured by ChIP-Seq experiments?
I was trying to compare differentially distributed ChIP experiments on the human genome, and thought that a list of tools for this kind of analysis might be useful to anyone interested in or doing the same thing. This is not a complete list, so feel free to add your favorite method, package or software to make it more interesting.
diffreps : Finding differential chromatin modification sites from ChIP-seq data
“diffReps is developed to serve this purpose. It scans the whole
genome using a sliding window, performing millions of statistical tests
and report the significant hits. diffReps takes into account the
biological variations within a group of samples and uses that
information to enhance the statistical power. Considering biological
variation is of high importance, especially for in vivo brain tissues”
Website : http://code.google.com/p/diffreps/
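To make the sliding-window idea in the diffReps description concrete, here is a minimal Python sketch. It is not diffReps' code: the function names, the window and step sizes, and the counts-per-million scaling are all my own assumptions, and a plain log2 fold-change cutoff stands in for the statistical tests on replicates that diffReps actually performs.

```python
import math
from collections import Counter

def window_counts(read_starts, window=1000, step=500):
    """Count reads per sliding window; keys are window start positions."""
    counts = Counter()
    for pos in read_starts:
        # A read at `pos` belongs to every window start s (a multiple of
        # `step`) with s <= pos < s + window.
        first = max(0, (pos - window) // step + 1)
        last = pos // step
        for k in range(first, last + 1):
            counts[k * step] += 1
    return counts

def differential_windows(treat, ctrl, window=1000, step=500,
                         min_log2fc=1.0, pseudo=1.0):
    """treat/ctrl: lists of replicates, each a list of read start positions
    on one chromosome. Returns (start, end, log2FC) for windows passing
    the fold-change cutoff (a toy stand-in for a real statistical test)."""
    def mean_cpm(replicates):
        totals = Counter()
        for reads in replicates:
            scale = 1e6 / max(len(reads), 1)  # library-size scaling
            for start, n in window_counts(reads, window, step).items():
                totals[start] += n * scale
        return {s: v / len(replicates) for s, v in totals.items()}

    t, c = mean_cpm(treat), mean_cpm(ctrl)
    hits = []
    for start in sorted(set(t) | set(c)):
        lfc = math.log2((t.get(start, 0.0) + pseudo) /
                        (c.get(start, 0.0) + pseudo))
        if abs(lfc) >= min_log2fc:
            hits.append((start, start + window, lfc))
    return hits
```

With real data you would run this per chromosome and replace the cutoff with a test that uses the variance between replicates, which is exactly the part diffReps is designed to handle.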
DiffBind : differential binding analysis of ChIP-Seq peak data
“DiffBind works primarily with peaksets, which are sets of genomic
intervals representing candidate protein binding sites. Each interval
consists of a chromosome, a start and end position, and usually a score
of some type indicating confidence in, or strength of, the peak.
Associated with each peakset are metadata relating to the experiment
from which the peakset was derived. Additionally, files containing
mapped sequencing reads (BAM/BED) can be associated with each peakset
(one for the ChIP data, and optionally another representing a control
dataset).”
Website : http://www.bioconductor.org/packages/release/bioc/html/DiffBind.html
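The peakset described in the DiffBind quote maps naturally onto a small data structure. The Python sketch below is only an illustration of that description, not DiffBind's API (DiffBind is an R package): the `Peak` class, the half-open coordinate convention and the `merge_peaksets` helper are hypothetical names I chose for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Peak:
    chrom: str
    start: int    # assumed 0-based, inclusive
    end: int      # assumed exclusive
    score: float  # confidence in, or strength of, the peak

def merge_peaksets(*peaksets):
    """Merge overlapping intervals from several peaksets into one list.

    Overlapping peaks on the same chromosome are fused; the merged peak
    keeps the highest score among its members (an arbitrary choice)."""
    peaks = sorted((p for ps in peaksets for p in ps),
                   key=lambda p: (p.chrom, p.start))
    merged = []
    for p in peaks:
        last = merged[-1] if merged else None
        if last is not None and last.chrom == p.chrom and p.start < last.end:
            merged[-1] = Peak(last.chrom, last.start,
                              max(last.end, p.end),
                              max(last.score, p.score))
        else:
            merged.append(p)
    return merged
```

Merging peaks from several samples into a shared set of intervals gives every sample the same regions in which to count reads, which is what a differential comparison needs.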
Rcade : R-based analysis of ChIP-seq And Differential Expression data
“Rcade is a tool that analyses ChIP-seq data and couples the results
to an existing Differential Expression (DE) analysis. A key application
of Rcade is in inferring the direct targets of a transcription factor
(TF) – these targets should exhibit TF binding activity, and their
expression levels should change in response to a perturbation of the TF.”
Website : http://www.bioconductor.org/packages/release/bioc/html/Rcade.html
MAnorm: a robust model for quantitative comparison of ChIP-Seq data sets
“MAnorm…for quantitative comparison of ChIP-Seq data sets describing
transcription factor binding sites and epigenetic modifications. The
quantitative binding differences inferred by MAnorm showed strong
correlation with both the changes in expression of target genes and the
binding of cell type-specific regulators.”
Website : http://bcb.dfci.harvard.edu/~gcyuan/MAnorm/MAnorm.htm
Publication : http://bcb.dfci.harvard.edu/~gcyuan/mypaper/shao;%20manorm.pdf
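As I understand the MAnorm publication, the method computes M and A values from read counts in peaks, fits a line on the peaks common to both samples, and uses that fit to rescale every peak. The Python sketch below follows that outline under loud assumptions: the function names are mine, a pseudocount of 1 is arbitrary, and ordinary least squares replaces the robust regression used by the real tool.

```python
import math

def ma_values(count1, count2, pseudo=1.0):
    """Return (M, A): log2 ratio and mean log2 intensity of two counts."""
    m = math.log2((count1 + pseudo) / (count2 + pseudo))
    a = 0.5 * math.log2((count1 + pseudo) * (count2 + pseudo))
    return m, a

def fit_line(points):
    """Ordinary least-squares fit M = intercept + slope * A.

    `points` is a list of (M, A) pairs with at least two distinct A values."""
    n = len(points)
    mean_m = sum(m for m, _ in points) / n
    mean_a = sum(a for _, a in points) / n
    sxx = sum((a - mean_a) ** 2 for _, a in points)
    sxy = sum((a - mean_a) * (m - mean_m) for m, a in points)
    slope = sxy / sxx
    return mean_m - slope * mean_a, slope

def normalized_m(common_counts, all_counts):
    """Fit on common peaks, then subtract the fitted trend from every peak.

    Both arguments are lists of (count_in_sample1, count_in_sample2)."""
    intercept, slope = fit_line([ma_values(c1, c2) for c1, c2 in common_counts])
    result = []
    for c1, c2 in all_counts:
        m, a = ma_values(c1, c2)
        result.append(m - (intercept + slope * a))
    return result
```

After normalization, a value near zero means the peak behaves like the common peaks, while a large positive or negative value suggests binding that is stronger in one sample than the other.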
ChIPnorm: A Statistical Method for Normalizing and Identifying Differential Regions in Histone Modification ChIP-seq Libraries
“ChIPnorm method removes most of the noise and bias in the data and outperforms other normalization methods”
Website : http://lcbb.epfl.ch/software.html
Publication : http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0039573
DESeq
“DESeq is an R package to analyse count data from high-throughput sequencing assays such as RNA-Seq and test for differential expression”
Website : http://www-huber.embl.de/users/anders/DESeq/
DIME: R-package for Identifying Differential ChIP-seq Based on an Ensemble of Mixture Models.
“R package that considers an ensemble of finite mixture models
combined with a local false discovery rate (fdr) for analyzing ChIP-seq
data comparing two samples. This package can also be used to identify
differential in other high throughput data such as microarray and DNA
methylation.”
Website : http://www.stat.osu.edu/~statgen/SOFTWARE/DIME/
edgeR : statistical analysis of replicated count data
“edgeR is an R/Bioconductor software package for statistical analysis
of replicated count data. Methods are designed for assessing
differential expression in comparative RNA-Seq experiments, but are
generally applicable to count data from other genome-scale platforms
(ChIP-Seq, MeDIP-Seq, Tag-Seq, SAGE-Seq etc.)”
Website : http://www.bioconductor.org/packages/release/bioc/html/edgeR.html
MACS: Model-based Analysis for ChIP-Seq
Website : http://liulab.dfci.harvard.edu/MACS/
“Next generation parallel sequencing technologies made chromatin immunoprecipitation followed by sequencing (ChIP-Seq) a popular strategy to study genome-wide protein-DNA interactions, while creating challenges for analysis algorithms. We present Model-based Analysis of ChIP-Seq (MACS) on short reads sequencers such as Genome Analyzer (Illumina / Solexa). MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, is publicly available open source, and can be used for ChIP-Seq with or without control samples.”
The newest version is now 1.4.2.
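The "dynamic Poisson distribution" in the MACS description can be sketched in a few lines of Python. As I understand the MACS paper, the expected background for a candidate peak is the largest of a genome-wide rate and rates estimated from control reads in windows of increasing width around the peak. The function names, argument layout and example numbers below are hypothetical, and this is not MACS' own code.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed by direct summation."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)   # P(X = 0)
    cdf = term
    for i in range(1, k):   # accumulate P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def local_lambda(control_counts, genome_rate, window):
    """Expected reads in a `window`-bp candidate region.

    control_counts maps a flank width in bp (e.g. 1000, 5000, 10000) to the
    number of control reads found within that width around the candidate.
    genome_rate is the genome-wide number of control reads per bp."""
    candidates = [genome_rate * window]
    for width, n in control_counts.items():
        candidates.append(n * window / width)  # rescale to window size
    return max(candidates)  # the most conservative background wins

def peak_pvalue(chip_reads, control_counts, genome_rate, window):
    """Poisson tail probability of seeing at least `chip_reads` reads."""
    lam = local_lambda(control_counts, genome_rate, window)
    return poisson_sf(chip_reads, lam)
```

Taking the maximum means a candidate sitting in a locally noisy region has to clear a higher bar than one in a quiet region, which is how local biases are captured.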