TherMos: Estimating protein-DNA binding energies from in vivo binding profiles

Wenjie Sun, Xiaoming Hu, Michael H K Lim, Calista K L Ng, Siew Hua Choo, Diogo S. Castro, Daniela Drechsel, François Guillemot, Prasanna Kolatkar, Ralf Jauch, Shyam Prabhakar

Research output: Contribution to journal (Article)

10 Citations (Scopus)

Abstract

Accurately characterizing transcription factor (TF)-DNA affinity is a central goal of regulatory genomics. Although thermodynamics provides the most natural language for describing the continuous range of TF-DNA affinity, traditional motif discovery algorithms focus instead on classification paradigms that aim to discriminate 'bound' and 'unbound' sequences. Moreover, these algorithms do not directly model the distribution of tags in ChIP-seq data. Here, we present a new algorithm named Thermodynamic Modeling of ChIP-seq (TherMos), which directly estimates a position-specific binding energy matrix (PSEM) from ChIP-seq/exo tag profiles. In cross-validation tests on seven genome-wide TF-DNA binding profiles, one of which we generated via ChIP-seq on a complex developing tissue, TherMos predicted quantitative TF-DNA binding with greater accuracy than five well-known algorithms. We experimentally validated TherMos binding energy models for Klf4 and Esrrb, using a novel protocol to measure PSEMs in vitro. Strikingly, our measurements revealed strong nonadditivity at multiple positions within the two PSEMs. Among the algorithms tested, only TherMos was able to model the entire binding energy landscape of Klf4 and Esrrb. Our study reveals new insights into the energetics of TF-DNA binding in vivo and provides an accurate first-principles approach to binding energy inference from ChIP-seq and ChIP-exo data.
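
To make the idea of a position-specific binding energy matrix more concrete, the following is a minimal sketch of how a PSEM might be used to score a DNA site and turn the energy into an occupancy. It is not the TherMos implementation. The matrix values, the function names (`binding_energy`, `occupancy`, `best_site`), the chemical potential `mu`, and the use of a simple two-state occupancy formula with energies in units of kT are all assumptions made for illustration. The sketch is also purely additive across positions, whereas the abstract reports nonadditivity at some positions.

```python
import math

# Hypothetical 3-position PSEM: one dict per motif position, giving the
# energy penalty (in units of kT) of each base relative to the best base.
# These numbers are invented for illustration only.
PSEM = [
    {"A": 0.0, "C": 1.5, "G": 2.0, "T": 1.0},
    {"A": 2.0, "C": 0.0, "G": 1.0, "T": 1.5},
    {"A": 1.0, "C": 2.0, "G": 0.0, "T": 0.5},
]


def binding_energy(site, psem=PSEM):
    """Sum per-position energies (additive model, an assumption here)."""
    if len(site) != len(psem):
        raise ValueError("site length must match PSEM width")
    return sum(column[base] for column, base in zip(psem, site))


def occupancy(energy, mu=0.0):
    """Two-state occupancy: 1 / (1 + exp(E - mu)), energies in kT.

    `mu` stands in for the TF chemical potential (a hypothetical parameter).
    """
    return 1.0 / (1.0 + math.exp(energy - mu))


def best_site(sequence, psem=PSEM):
    """Return the lowest-energy window of PSEM width in `sequence`."""
    width = len(psem)
    windows = [sequence[i:i + width] for i in range(len(sequence) - width + 1)]
    return min(windows, key=lambda site: binding_energy(site, psem))


if __name__ == "__main__":
    for site in ("ACG", "CAT"):
        energy = binding_energy(site)
        print(site, energy, round(occupancy(energy), 3))
    print(best_site("TTACGTT"))
```

Under these assumptions a lower energy always gives a higher predicted occupancy, which is what lets a continuous energy scale describe graded binding rather than a bound/unbound classification.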

Original language: English
Pages (from-to): 5555-5568
Number of pages: 14
Journal: Nucleic Acids Research
Volume: 41
Issue number: 11
DOI: https://doi.org/10.1093/nar/gkt250
Publication status: Published - 1 Jun 2013
Externally published: Yes

Fingerprint

DNA-Binding Proteins
Thermodynamics
Transcription Factors
DNA
Genomics
Language
Genome

ASJC Scopus subject areas

  • Genetics

Cite this

Sun, W., Hu, X., Lim, M. H. K., Ng, C. K. L., Choo, S. H., Castro, D. S., ... Prabhakar, S. (2013). TherMos: Estimating protein-DNA binding energies from in vivo binding profiles. Nucleic Acids Research, 41(11), 5555-5568. https://doi.org/10.1093/nar/gkt250

TherMos: Estimating protein-DNA binding energies from in vivo binding profiles. / Sun, Wenjie; Hu, Xiaoming; Lim, Michael H K; Ng, Calista K L; Choo, Siew Hua; Castro, Diogo S.; Drechsel, Daniela; Guillemot, François; Kolatkar, Prasanna; Jauch, Ralf; Prabhakar, Shyam.

In: Nucleic Acids Research, Vol. 41, No. 11, 01.06.2013, p. 5555-5568.

Sun, W, Hu, X, Lim, MHK, Ng, CKL, Choo, SH, Castro, DS, Drechsel, D, Guillemot, F, Kolatkar, P, Jauch, R & Prabhakar, S 2013, 'TherMos: Estimating protein-DNA binding energies from in vivo binding profiles', Nucleic Acids Research, vol. 41, no. 11, pp. 5555-5568. https://doi.org/10.1093/nar/gkt250
Sun, Wenjie; Hu, Xiaoming; Lim, Michael H K; Ng, Calista K L; Choo, Siew Hua; Castro, Diogo S.; Drechsel, Daniela; Guillemot, François; Kolatkar, Prasanna; Jauch, Ralf; Prabhakar, Shyam. / TherMos: Estimating protein-DNA binding energies from in vivo binding profiles. In: Nucleic Acids Research. 2013; Vol. 41, No. 11. pp. 5555-5568.
@article{2eb3398d7dcd40689340b3ba03ebe2c1,
title = "TherMos: Estimating protein-DNA binding energies from in vivo binding profiles",
abstract = "Accurately characterizing transcription factor (TF)-DNA affinity is a central goal of regulatory genomics. Although thermodynamics provides the most natural language for describing the continuous range of TF-DNA affinity, traditional motif discovery algorithms focus instead on classification paradigms that aim to discriminate 'bound' and 'unbound' sequences. Moreover, these algorithms do not directly model the distribution of tags in ChIP-seq data. Here, we present a new algorithm named Thermodynamic Modeling of ChIP-seq (TherMos), which directly estimates a position-specific binding energy matrix (PSEM) from ChIP-seq/exo tag profiles. In cross-validation tests on seven genome-wide TF-DNA binding profiles, one of which we generated via ChIP-seq on a complex developing tissue, TherMos predicted quantitative TF-DNA binding with greater accuracy than five well-known algorithms. We experimentally validated TherMos binding energy models for Klf4 and Esrrb, using a novel protocol to measure PSEMs in vitro. Strikingly, our measurements revealed strong nonadditivity at multiple positions within the two PSEMs. Among the algorithms tested, only TherMos was able to model the entire binding energy landscape of Klf4 and Esrrb. Our study reveals new insights into the energetics of TF-DNA binding in vivo and provides an accurate first-principles approach to binding energy inference from ChIP-seq and ChIP-exo data.",
author = "Wenjie Sun and Xiaoming Hu and Lim, {Michael H K} and Ng, {Calista K L} and Choo, {Siew Hua} and Castro, {Diogo S.} and Daniela Drechsel and Fran{\c{c}}ois Guillemot and Prasanna Kolatkar and Ralf Jauch and Shyam Prabhakar",
year = "2013",
month = "6",
day = "1",
doi = "10.1093/nar/gkt250",
language = "English",
volume = "41",
pages = "5555--5568",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "11",

}

TY - JOUR

T1 - TherMos

T2 - Estimating protein-DNA binding energies from in vivo binding profiles

AU - Sun, Wenjie

AU - Hu, Xiaoming

AU - Lim, Michael H K

AU - Ng, Calista K L

AU - Choo, Siew Hua

AU - Castro, Diogo S.

AU - Drechsel, Daniela

AU - Guillemot, François

AU - Kolatkar, Prasanna

AU - Jauch, Ralf

AU - Prabhakar, Shyam

PY - 2013/6/1

Y1 - 2013/6/1

N2 - Accurately characterizing transcription factor (TF)-DNA affinity is a central goal of regulatory genomics. Although thermodynamics provides the most natural language for describing the continuous range of TF-DNA affinity, traditional motif discovery algorithms focus instead on classification paradigms that aim to discriminate 'bound' and 'unbound' sequences. Moreover, these algorithms do not directly model the distribution of tags in ChIP-seq data. Here, we present a new algorithm named Thermodynamic Modeling of ChIP-seq (TherMos), which directly estimates a position-specific binding energy matrix (PSEM) from ChIP-seq/exo tag profiles. In cross-validation tests on seven genome-wide TF-DNA binding profiles, one of which we generated via ChIP-seq on a complex developing tissue, TherMos predicted quantitative TF-DNA binding with greater accuracy than five well-known algorithms. We experimentally validated TherMos binding energy models for Klf4 and Esrrb, using a novel protocol to measure PSEMs in vitro. Strikingly, our measurements revealed strong nonadditivity at multiple positions within the two PSEMs. Among the algorithms tested, only TherMos was able to model the entire binding energy landscape of Klf4 and Esrrb. Our study reveals new insights into the energetics of TF-DNA binding in vivo and provides an accurate first-principles approach to binding energy inference from ChIP-seq and ChIP-exo data.

UR - http://www.scopus.com/inward/record.url?scp=84878882810&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84878882810&partnerID=8YFLogxK

U2 - 10.1093/nar/gkt250

DO - 10.1093/nar/gkt250

M3 - Article

VL - 41

SP - 5555

EP - 5568

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 11

ER -