More robust detection of motifs in coexpressed genes by using phylogenetic information

Pieter Monsieurs, Gert Thijs, Abeer A. Fadda, Sigrid C J De Keersmaecker, Jozef Vanderleyden, Bart De Moor, Kathleen Marchal

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

Background: Several motif detection algorithms have been developed to discover overrepresented motifs in sets of coexpressed genes. In a noisy gene list, however, the ratio of genes containing the motif to genes lacking it may be too low for classical motif detection tools to find the motif. To recover motifs that are present but not significantly enriched, we developed a procedure that first uses phylogenetic footprinting to delineate all potential motifs in each gene. We then compare all detected motifs with one another and retain as candidates those shared by at least a few genes in the data set.

Results: We applied our methodology to a compiled test data set containing known regulatory motifs and to two biological data sets derived from genome-wide expression studies. The methodology consists of four consecutive steps: 1) identifying conserved regions in orthologous intergenic regions, 2) aligning these conserved regions, 3) clustering the conserved regions that contain similar regulatory regions and extracting the regulatory motifs, and 4) screening the input intergenic sequences with the detected regulatory motif models. It proves to be a powerful tool for detecting regulatory motifs when the input data set has a low signal-to-noise ratio. A comparison with two other motif detection algorithms indicates the robustness of our algorithm.

Conclusion: We developed an approach that can reliably identify multiple regulatory motifs that lack a high degree of overrepresentation in a set of coexpressed genes (motifs belonging to sparsely connected hubs in the regulatory network) by exploiting both coexpression and phylogenetic information.
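The idea in the abstract (delineate conserved candidates per gene, keep those shared by a few genes, then screen the input sequences) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the orthologous intergenic regions are already aligned, treats fully identical columns as "conserved", and uses exact shared k-mers in place of the alignment, clustering, and motif-model steps. The function names, the parameters `min_len`, `k`, and `min_genes`, and the sequences are all hypothetical.

```python
from collections import defaultdict


def conserved_blocks(aligned, min_len=6):
    """Return maximal runs of gap-free columns identical in all aligned orthologs.

    Stand-in for phylogenetic footprinting (assumption: input is pre-aligned).
    """
    blocks, start = [], None
    ncols = len(aligned[0])
    for i in range(ncols + 1):
        conserved = (
            i < ncols
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if conserved and start is None:
            start = i  # a conserved run begins here
        elif not conserved and start is not None:
            if i - start >= min_len:
                blocks.append(aligned[0][start:i])
            start = None
    return blocks


def shared_kmers(blocks_by_gene, k=6, min_genes=2):
    """Keep k-mers from conserved blocks that occur in at least min_genes genes.

    Exact k-mer sharing replaces the clustering of similar conserved regions.
    """
    genes_by_kmer = defaultdict(set)
    for gene, blocks in blocks_by_gene.items():
        for block in blocks:
            for i in range(len(block) - k + 1):
                genes_by_kmer[block[i:i + k]].add(gene)
    return {
        kmer: genes
        for kmer, genes in genes_by_kmer.items()
        if len(genes) >= min_genes
    }


def screen(sequences, kmer):
    """Return the names of input sequences containing the k-mer.

    Exact matching replaces scoring with a motif model.
    """
    return sorted(name for name, seq in sequences.items() if kmer in seq)


# Hypothetical two-species alignments for three coexpressed genes;
# geneC is a "noise" gene that lacks the shared motif.
alignments = {
    "geneA": ["AATGTGACGTCC", "CGTGTGACGAGT"],
    "geneB": ["GCTTGTGACATA", "ATCTGTGACTCG"],
    "geneC": ["ACGGATCCTAGA", "TCGGATCCAGTC"],
}
blocks_by_gene = {g: conserved_blocks(a) for g, a in alignments.items()}
shared = shared_kmers(blocks_by_gene)
reference = {g: a[0] for g, a in alignments.items()}
hits = {kmer: screen(reference, kmer) for kmer in shared}
```

With this toy input, `shared` contains only `TGTGAC`, found in geneA and geneB. The block conserved in geneC is discarded because no other gene shares it, which mirrors the filtering that lets a motif surface even when it is not overrepresented in the full gene list.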

Original language: English
Article number: 160
Journal: BMC Bioinformatics
Volume: 7
DOI: https://doi.org/10.1186/1471-2105-7-160
Publication status: Published - 20 Mar 2006
Externally published: Yes

Fingerprint

Phylogenetics
Genes
Gene
Intergenic DNA
Methodology
Regulatory Networks
Nucleic Acid Regulatory Sequences
Signal-To-Noise Ratio
Screening
Cluster Analysis
Consecutive
Genome
Clustering
Signal to noise ratio
Robustness
Datasets

ASJC Scopus subject areas

  • Medicine (all)
  • Structural Biology
  • Applied Mathematics

Cite this

Monsieurs, P., Thijs, G., Fadda, A. A., De Keersmaecker, S. C. J., Vanderleyden, J., De Moor, B., & Marchal, K. (2006). More robust detection of motifs in coexpressed genes by using phylogenetic information. BMC Bioinformatics, 7, [160]. https://doi.org/10.1186/1471-2105-7-160

@article{1fa1c9ed950e4f20a9c5c75ff9eb326f,
title = "More robust detection of motifs in coexpressed genes by using phylogenetic information",
author = "Pieter Monsieurs and Gert Thijs and Fadda, {Abeer A.} and {De Keersmaecker}, {Sigrid C J} and Jozef Vanderleyden and {De Moor}, Bart and Kathleen Marchal",
year = "2006",
month = "3",
day = "20",
doi = "10.1186/1471-2105-7-160",
language = "English",
volume = "7",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - More robust detection of motifs in coexpressed genes by using phylogenetic information

AU - Monsieurs, Pieter

AU - Thijs, Gert

AU - Fadda, Abeer A.

AU - De Keersmaecker, Sigrid C J

AU - Vanderleyden, Jozef

AU - De Moor, Bart

AU - Marchal, Kathleen

PY - 2006/3/20

Y1 - 2006/3/20

UR - http://www.scopus.com/inward/record.url?scp=33746599442&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33746599442&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-7-160

DO - 10.1186/1471-2105-7-160

M3 - Article

VL - 7

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 160

ER -