Thursday, November 15, 2012

Get the most out of your ChIP-seq experiments

ChIP-seq is now widely used to profile the enrichment of a DNA-binding protein across a genome. Its computational analysis is a mature field, and many tools exist to get the most out of the raw data, depending on the biological context and the study. A common goal is to compare the binding of a histone mark or transcription factor between two contrasting conditions, such as disease vs. control or, to keep it general, Condition 1 vs. Condition 2.
Knowing the differentially enriched regions can help in understanding the biology of transcription factor binding, a field that, despite all the studies in the literature, has not yet revealed all its secrets. I have come across many publications lately on the nature of binding itself, some dealing with binding site structure and characteristics. To make the most of your ChIP-Seq experiments, it is crucial to relate this binding to a stimulus, such as one caused by a disease or a genomic alteration. The real question, though, is whether any change we observe in the data is the cause of a phenotype or a consequence of the real alteration, which might not be captured by ChIP-Seq experiments.
I was trying to compare differentially distributed ChIP experiments on the human genome, and thought a list of tools that help with this process might be useful to anyone interested in or doing the same thing. This is not a complete list, so feel free to add your favorite method, package or software to make it more interesting.

diffReps: Finding differential chromatin modification sites from ChIP-seq data

“diffReps is developed to serve this purpose. It scans the whole genome using a sliding window, performing millions of statistical tests and report the significant hits. diffReps takes into account the biological variations within a group of samples and uses that information to enhance the statistical power. Considering biological variation is of high importance, especially for in vivo brain tissues”
Website: http://code.google.com/p/diffreps/
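
To make the sliding-window idea concrete, here is a minimal Python sketch. It is not diffReps code: the function names, window sizes and thresholds are all my own assumptions, and a plain log2 fold-change cutoff stands in for the replicate-aware statistical tests that diffReps actually performs.

```python
import math
from bisect import bisect_left


def sliding_window_counts(positions, chrom_length, window=1000, step=500):
    """Count read start positions falling in each window [start, start + window)."""
    positions = sorted(positions)
    counts = []
    for start in range(0, max(chrom_length - window, 0) + 1, step):
        end = start + window
        n = bisect_left(positions, end) - bisect_left(positions, start)
        counts.append((start, end, n))
    return counts


def differential_windows(cond1, cond2, chrom_length, window=1000, step=500,
                         min_log2_fc=1.0, min_reads=10, pseudocount=1.0):
    """Return windows whose read counts differ between two conditions.

    Assumption: a fold-change threshold replaces a proper statistical test,
    and both samples are taken to have comparable sequencing depth.
    """
    w1 = sliding_window_counts(cond1, chrom_length, window, step)
    w2 = sliding_window_counts(cond2, chrom_length, window, step)
    hits = []
    for (start, end, n1), (_, _, n2) in zip(w1, w2):
        if n1 + n2 < min_reads:
            continue  # too few reads to say anything about this window
        log2_fc = math.log2((n1 + pseudocount) / (n2 + pseudocount))
        if abs(log2_fc) >= min_log2_fc:
            hits.append((start, end, n1, n2, log2_fc))
    return hits


# Toy data: Condition 1 has ten times more reads than Condition 2 near 1 kb.
condition1 = list(range(1000, 1100))
condition2 = list(range(1000, 1100, 10))
hits = differential_windows(condition1, condition2, chrom_length=3000)
```

With real data you would run this per chromosome and, more importantly, replace the fold-change cutoff with a test that uses the variation between replicates.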

DiffBind: differential binding analysis of ChIP-Seq peak data

“DiffBind works primarily with peaksets, which are sets of genomic intervals representing candidate protein binding sites. Each interval consists of a chromosome, a start and end position, and usually a score of some type indicating confidence in, or strength of, the peak. Associated with each peakset are metadata relating to the experiment from which the peakset was derived. Additionally, files containing mapped sequencing reads (BAM/BED) can be associated with each peakset (one for the ChIP data, and optionally another representing a control dataset).”
Website: http://www.bioconductor.org/packages/release/bioc/html/DiffBind.html
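
The peakset described above maps naturally onto a small data structure. The sketch below is in Python only for illustration (DiffBind itself is an R package), and `Peak` and `merge_peaks` are hypothetical names of mine, not part of its API. It assumes half-open, BED-style coordinates and shows one simple way to merge overlapping peaks from several peaksets into a single consensus set.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Peak:
    """One interval of a peakset: chromosome, start, end and a score."""
    chrom: str
    start: int  # 0-based, inclusive (assumed BED-style)
    end: int    # exclusive
    score: float


def merge_peaks(peaks: Iterable[Peak]) -> List[Peak]:
    """Merge overlapping peaks into consensus intervals.

    Assumption: the merged interval keeps the highest score of its members.
    """
    merged: List[Peak] = []
    for peak in sorted(peaks, key=lambda p: (p.chrom, p.start, p.end)):
        last = merged[-1] if merged else None
        if last and last.chrom == peak.chrom and peak.start < last.end:
            merged[-1] = Peak(last.chrom, last.start,
                              max(last.end, peak.end),
                              max(last.score, peak.score))
        else:
            merged.append(peak)
    return merged


# Two toy peaksets, e.g. one per condition.
peakset1 = [Peak("chr1", 100, 200, 5.0), Peak("chr1", 400, 500, 3.0)]
peakset2 = [Peak("chr1", 150, 250, 8.0), Peak("chr2", 100, 200, 1.0)]
consensus = merge_peaks(peakset1 + peakset2)
```

Read counts from the associated BAM/BED files would then be tallied over each consensus interval before any differential analysis.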

Rcade: R-based analysis of ChIP-seq And Differential Expression data

“Rcade is a tool that analyses ChIP-seq data and couples the results to an existing Differential Expression (DE) analysis. A key application of Rcade is in inferring the direct targets of a transcription factor (TF) – these targets should exhibit TF binding activity, and their expression levels should change in response to a perturbation of the TF.”
Website: http://www.bioconductor.org/packages/release/bioc/html/Rcade.html

MAnorm: a robust model for quantitative comparison of ChIP-Seq data sets

“MAnorm…for quantitative comparison of ChIP-Seq data sets describing transcription factor binding sites and epigenetic modifications. The quantitative binding differences inferred by MAnorm showed strong correlation with both the changes in expression of target genes and the binding of cell type-specific regulators.”
Website: http://bcb.dfci.harvard.edu/~gcyuan/MAnorm/MAnorm.htm
Publication: http://bcb.dfci.harvard.edu/~gcyuan/mypaper/shao;%20manorm.pdf
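
As I understand the approach, the name comes from the MA plot: for each peak, M is the log2 ratio of read counts between the two samples and A is their average log2 intensity. A line fitted to the peaks common to both samples is then used to rescale all peaks. The Python sketch below illustrates that idea only. The function names and the pseudocount are my assumptions, and it uses ordinary least squares where the published method uses a robust regression.

```python
import math


def ma_values(count1, count2, pseudocount=1.0):
    """Return (M, A) for one peak: log2 ratio and mean log2 intensity."""
    x1 = math.log2(count1 + pseudocount)
    x2 = math.log2(count2 + pseudocount)
    return x1 - x2, 0.5 * (x1 + x2)


def fit_line(a_values, m_values):
    """Ordinary least-squares fit of M = intercept + slope * A."""
    n = len(a_values)
    mean_a = sum(a_values) / n
    mean_m = sum(m_values) / n
    sxx = sum((a - mean_a) ** 2 for a in a_values)
    sxy = sum((a - mean_a) * (m - mean_m) for a, m in zip(a_values, m_values))
    slope = sxy / sxx
    return slope, mean_m - slope * mean_a


def normalized_m(count1, count2, slope, intercept):
    """M value of a peak after subtracting the trend fitted on common peaks."""
    m, a = ma_values(count1, count2)
    return m - (intercept + slope * a)


# Toy common peaks: sample 1 is sequenced about twice as deep as sample 2.
common_peaks = [(3, 1), (7, 3), (15, 7), (31, 15)]
m_vals, a_vals = zip(*(ma_values(c1, c2) for c1, c2 in common_peaks))
slope, intercept = fit_line(a_vals, m_vals)

# A peak that is 8x higher in sample 1 before normalization.
corrected = normalized_m(63, 7, slope, intercept)
```

After normalization, a common peak should sit near M = 0, so whatever M remains for the other peaks is the quantitative binding difference of interest.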

ChIPnorm: A Statistical Method for Normalizing and Identifying Differential Regions in Histone Modification ChIP-seq Libraries

“ChIPnorm method removes most of the noise and bias in the data and outperforms other normalization methods”
Website: http://lcbb.epfl.ch/software.html
Publication: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0039573

DESeq

“DESeq is an R package to analyse count data from high-throughput sequencing assays such as RNA-Seq and test for differential expression”
Website: http://www-huber.embl.de/users/anders/DESeq/

DIME: R-package for Identifying Differential ChIP-seq Based on an Ensemble of Mixture Models.

“R package that considers an ensemble of finite mixture models combined with a local false discovery rate (fdr) for analyzing ChIP-seq data comparing two samples. This package can also be used to identify differential in other high throughput data such as microarray and DNA methylation.”
Website: http://www.stat.osu.edu/~statgen/SOFTWARE/DIME/

edgeR

“edgeR is an R/Bioconductor software package for statistical analysis of replicated count data. Methods are designed for assessing differential expression in comparative RNA-Seq experiments, but are generally applicable to count data from other genome-scale platforms (ChIP-Seq, MeDIP-Seq, Tag-Seq, SAGE-Seq etc.)”
Website: http://www.bioconductor.org/packages/release/bioc/html/edgeR.html
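
Count-based tools such as DESeq and edgeR expect a table of read counts per region and per sample, which for ChIP-Seq usually means reads counted over peaks or windows. Before comparing samples, counts have to be put on a common scale. The Python sketch below shows only the simplest version of that step, counts per million. The function name and input layout are my assumptions, and it does not reproduce the normalization or the statistical models that these packages apply.

```python
def counts_per_million(count_table):
    """Scale raw counts by library size.

    count_table maps a sample name to its list of per-region read counts
    (assumed layout); the library size is taken as the sum of those counts.
    """
    cpm = {}
    for sample, counts in count_table.items():
        library_size = sum(counts)
        if library_size == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        cpm[sample] = [c * 1e6 / library_size for c in counts]
    return cpm


# Toy table: same binding profile, but Condition 2 is sequenced 10x deeper.
raw_counts = {
    "condition1": [10, 30, 60],
    "condition2": [100, 300, 600],
}
scaled = counts_per_million(raw_counts)
```

After scaling, the two toy samples look identical, which is the point: the tenfold difference in raw counts was sequencing depth, not binding.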

1 comment:

  1. MACS: Model-based Analysis for ChIP-Seq

    http://liulab.dfci.harvard.edu/MACS/

    Next generation parallel sequencing technologies made chromatin immunoprecipitation followed by sequencing (ChIP-Seq) a popular strategy to study genome-wide protein-DNA interactions, while creating challenges for analysis algorithms. We present Model-based Analysis of ChIP-Seq (MACS) on short reads sequencers such as Genome Analyzer (Illumina / Solexa). MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, is publicly available open source, and can be used for ChIP-Seq with or without control samples.

    The newest version is now 1.4.2.
