Labeling negative examples in supervised learning of new gene regulatory connections

Luigi Cerulo, Vincenzo Paduano, Pietro Zoppoli, Michele Ceccarelli

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Supervised learning methods have recently been used to learn gene regulatory networks from gene expression data. The basic approach consists of building a binary classifier from feature vectors composed of the expression levels of a set of known regulatory connections, available in public databases or reported in the literature. Such a classifier is then used to predict new, unknown connections. The quality of the training set plays a crucial role in such an inference scheme. In binary classification the training set should be composed of positive and negative examples, but the biology literature typically records only that two genes interact. The counterpart information is usually not reported, as biologists are rarely in a position to state that two genes do not interact. The over-representation of topology motifs in currently known gene regulatory networks, such as feed-forward loops, bi-fan clusters, and single-input modules, could drive the selection of reliable negative examples. We introduce, discuss, and evaluate a number of negative selection heuristics that exploit the known gene network topology of Escherichia coli and Saccharomyces cerevisiae.
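The training-set construction described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' method: the gene names and expression values are made up, the feature vector is assumed to be the concatenation of the two genes' expression profiles, and the single heuristic shown (do not label a pair as negative if it would complete a feed-forward loop over known edges) is only one plausible way that motif over-representation might guide negative selection.

```python
from itertools import permutations

# Toy expression matrix: gene -> levels across 3 conditions (made-up values).
expression = {
    "A": [0.1, 0.9, 0.4],
    "B": [0.2, 0.8, 0.5],
    "C": [0.7, 0.3, 0.6],
    "D": [0.5, 0.5, 0.1],
}

# Known regulatory connections (regulator, target): the only labeled data.
known_edges = {("A", "B"), ("B", "C")}


def feature_vector(regulator, target):
    """Concatenate the two expression profiles (an assumed encoding)."""
    return expression[regulator] + expression[target]


def completes_feed_forward_loop(regulator, target, edges):
    """True if regulator -> target would close a feed-forward loop
    regulator -> mid -> target built from two already-known edges."""
    return any(
        (regulator, mid) in edges and (mid, target) in edges
        for mid in expression
    )


def select_negatives(edges):
    """Hypothetical heuristic: every unlabeled ordered pair is a candidate
    negative, except pairs that would complete a feed-forward loop, which
    are treated as likely hidden positives and left unlabeled."""
    return sorted(
        pair
        for pair in permutations(expression, 2)
        if pair not in edges and not completes_feed_forward_loop(*pair, edges)
    )


negatives = select_negatives(known_edges)

# Labeled training set: (feature vector, label), 1 = interacts, 0 = negative.
training_set = [(feature_vector(r, t), 1) for r, t in sorted(known_edges)]
training_set += [(feature_vector(r, t), 0) for r, t in negatives]
```

In this toy network the pair (A, C) is withheld from the negatives because A regulates B and B regulates C; the resulting `training_set` could then be passed to any binary classifier.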

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 159-173
Number of pages: 15
Volume: 6685 LNBI
DOI: https://doi.org/10.1007/978-3-642-21946-7_13
Publication status: Published - 19 Aug 2011
Externally published: Yes
Event: 7th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2010 - Palermo, Italy
Duration: 16 Sep 2010 to 18 Sep 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6685 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 7th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2010
Country: Italy
City: Palermo
Period: 16/9/10 to 18/9/10

Fingerprint

Supervised learning
Labeling
Genes
Gene Regulatory Network
Gene
Classifier
Negative Selection
Binary Classification
Gene Networks
Saccharomyces Cerevisiae
Feedforward
Gene Expression Data
Feature Vector
Classifiers
Network Topology
Escherichia Coli
Biology
Topology
Heuristics

Keywords

  • positive only
  • reverse engineering gene regulatory networks
  • supervised learning

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Cerulo, L., Paduano, V., Zoppoli, P., & Ceccarelli, M. (2011). Labeling negative examples in supervised learning of new gene regulatory connections. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6685 LNBI, pp. 159-173). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6685 LNBI). https://doi.org/10.1007/978-3-642-21946-7_13

@inproceedings{1d36cdb2b8974d298ad3c9d933a1db44,
title = "Labeling negative examples in supervised learning of new gene regulatory connections",
abstract = "Supervised learning methods have recently been used to learn gene regulatory networks from gene expression data. The basic approach consists of building a binary classifier from feature vectors composed of the expression levels of a set of known regulatory connections, available in public databases or reported in the literature. Such a classifier is then used to predict new, unknown connections. The quality of the training set plays a crucial role in such an inference scheme. In binary classification the training set should be composed of positive and negative examples, but the biology literature typically records only that two genes interact. The counterpart information is usually not reported, as biologists are rarely in a position to state that two genes do not interact. The over-representation of topology motifs in currently known gene regulatory networks, such as feed-forward loops, bi-fan clusters, and single-input modules, could drive the selection of reliable negative examples. We introduce, discuss, and evaluate a number of negative selection heuristics that exploit the known gene network topology of Escherichia coli and Saccharomyces cerevisiae.",
keywords = "positive only, reverse engineering gene regulatory networks, supervised learning",
author = "Luigi Cerulo and Vincenzo Paduano and Pietro Zoppoli and Michele Ceccarelli",
year = "2011",
month = "8",
day = "19",
doi = "10.1007/978-3-642-21946-7_13",
language = "English",
isbn = "9783642219450",
volume = "6685 LNBI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "159--173",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Labeling negative examples in supervised learning of new gene regulatory connections

AU - Cerulo, Luigi

AU - Paduano, Vincenzo

AU - Zoppoli, Pietro

AU - Ceccarelli, Michele

PY - 2011/8/19

Y1 - 2011/8/19

AB - Supervised learning methods have recently been used to learn gene regulatory networks from gene expression data. The basic approach consists of building a binary classifier from feature vectors composed of the expression levels of a set of known regulatory connections, available in public databases or reported in the literature. Such a classifier is then used to predict new, unknown connections. The quality of the training set plays a crucial role in such an inference scheme. In binary classification the training set should be composed of positive and negative examples, but the biology literature typically records only that two genes interact. The counterpart information is usually not reported, as biologists are rarely in a position to state that two genes do not interact. The over-representation of topology motifs in currently known gene regulatory networks, such as feed-forward loops, bi-fan clusters, and single-input modules, could drive the selection of reliable negative examples. We introduce, discuss, and evaluate a number of negative selection heuristics that exploit the known gene network topology of Escherichia coli and Saccharomyces cerevisiae.

KW - positive only

KW - reverse engineering gene regulatory networks

KW - supervised learning

UR - http://www.scopus.com/inward/record.url?scp=80051689293&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80051689293&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-21946-7_13

DO - 10.1007/978-3-642-21946-7_13

M3 - Conference contribution

AN - SCOPUS:80051689293

SN - 9783642219450

VL - 6685 LNBI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 159

EP - 173

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -