Mining microarray expression data by literature profiling.

Research output: Contribution to journal › Article

110 Citations (Scopus)

Abstract

BACKGROUND: The rapidly expanding fields of genomics and proteomics have prompted the development of computational methods for managing, analyzing and visualizing expression data derived from microarray screening. Nevertheless, the lack of efficient techniques for assessing the biological implications of gene-expression data remains an important obstacle in exploiting this information.

RESULTS: To address this need, we have developed a mining technique based on the analysis of literature profiles generated by extracting the frequencies of certain terms from thousands of abstracts stored in the Medline literature database. Terms are then filtered on the basis of both repetitive occurrence and co-occurrence among multiple gene entries. Finally, clustering analysis is performed on the retained frequency values, shaping a coherent picture of the functional relationship among large and heterogeneous lists of genes. Such data treatment also provides information on the nature and pertinence of the associations that were formed.

CONCLUSIONS: The analysis of patterns of term occurrence in abstracts constitutes a means of exploring the biological significance of large and heterogeneous lists of genes. This approach should contribute to optimizing the exploitation of microarray technologies by providing investigators with an interface between complex expression data and large literature resources.
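The three steps named in the RESULTS paragraph (term-frequency extraction, filtering by repetitive occurrence and co-occurrence, clustering of the retained values) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the gene names, the toy abstracts, the thresholds `min_freq` and `min_genes`, the cosine similarity measure, and the greedy grouping step are all assumptions chosen for illustration, and the abstract does not say which clustering algorithm or cut-offs were actually used.

```python
import math
import re
from collections import Counter


def term_frequencies(abstracts):
    """Fraction of one gene's abstracts that mention each term."""
    counts = Counter()
    for text in abstracts:
        # Count each term at most once per abstract.
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    return {term: c / len(abstracts) for term, c in counts.items()}


def build_profiles(gene_abstracts, min_freq, min_genes):
    """Keep terms that recur within a gene's abstracts (repetitive
    occurrence) for at least `min_genes` genes (co-occurrence)."""
    freqs = {g: term_frequencies(a) for g, a in gene_abstracts.items()}
    support = Counter()
    for f in freqs.values():
        support.update(t for t, v in f.items() if v >= min_freq)
    terms = sorted(t for t, n in support.items() if n >= min_genes)
    profiles = {g: [f.get(t, 0.0) for t in terms] for g, f in freqs.items()}
    return terms, profiles


def cosine(u, v):
    """Cosine similarity between two frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_genes(profiles, threshold):
    """Greedy grouping (a simple stand-in for a real clustering method):
    a gene joins the first group holding a sufficiently similar member."""
    groups = []
    for gene in sorted(profiles):
        for members in groups:
            if any(cosine(profiles[gene], profiles[m]) >= threshold for m in members):
                members.append(gene)
                break
        else:
            groups.append([gene])
    return groups


# Hypothetical toy input: invented gene names and invented abstract snippets.
toy = {
    "geneA": ["induces apoptosis in tumor cells", "apoptosis of tumor cells after treatment"],
    "geneB": ["regulates apoptosis in tumor tissue", "tumor apoptosis pathway"],
    "geneC": ["cytokine secretion by macrophages", "macrophages release cytokine signals"],
    "geneD": ["cytokine response of macrophages", "activated macrophages and cytokine levels"],
}

terms, profiles = build_profiles(toy, min_freq=0.75, min_genes=2)
groups = group_genes(profiles, threshold=0.5)
print(terms)
print(groups)
```

In this toy run the filter discards terms that appear in only one abstract or for only one gene, so the grouping is driven by the shared vocabulary alone. A real analysis would draw abstracts from Medline and would likely replace the greedy grouping with an established clustering method.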

Original language: English
Journal: Genome Biology
Volume: 3
Issue number: 10
Publication status: Published - 13 Sep 2002
Externally published: Yes

Fingerprint

Genes
microarray technology
Genomics
Proteomics
Cluster Analysis
methodology
Research Personnel
Databases
screening
Gene Expression
Technology
resource
analysis
need
method

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Genetics
  • Cell Biology

Cite this

Mining microarray expression data by literature profiling. / Chaussabel, Damien J.; Sher, Alan.

In: Genome Biology, Vol. 3, No. 10, 13.09.2002.

@article{844f873b34a749eeae699348d232ae48,
title = "Mining microarray expression data by literature profiling.",
abstract = "BACKGROUND: The rapidly expanding fields of genomics and proteomics have prompted the development of computational methods for managing, analyzing and visualizing expression data derived from microarray screening. Nevertheless, the lack of efficient techniques for assessing the biological implications of gene-expression data remains an important obstacle in exploiting this information. RESULTS: To address this need, we have developed a mining technique based on the analysis of literature profiles generated by extracting the frequencies of certain terms from thousands of abstracts stored in the Medline literature database. Terms are then filtered on the basis of both repetitive occurrence and co-occurrence among multiple gene entries. Finally, clustering analysis is performed on the retained frequency values, shaping a coherent picture of the functional relationship among large and heterogeneous lists of genes. Such data treatment also provides information on the nature and pertinence of the associations that were formed. CONCLUSIONS: The analysis of patterns of term occurrence in abstracts constitutes a means of exploring the biological significance of large and heterogeneous lists of genes. This approach should contribute to optimizing the exploitation of microarray technologies by providing investigators with an interface between complex expression data and large literature resources.",
author = "Chaussabel, {Damien J.} and Alan Sher",
year = "2002",
month = "9",
day = "13",
language = "English",
volume = "3",
journal = "Genome Biology",
issn = "1474-7596",
publisher = "BioMed Central",
number = "10",
}

TY  - JOUR
T1  - Mining microarray expression data by literature profiling.
AU  - Chaussabel, Damien J.
AU  - Sher, Alan
PY  - 2002/9/13
Y1  - 2002/9/13
AB  - BACKGROUND: The rapidly expanding fields of genomics and proteomics have prompted the development of computational methods for managing, analyzing and visualizing expression data derived from microarray screening. Nevertheless, the lack of efficient techniques for assessing the biological implications of gene-expression data remains an important obstacle in exploiting this information. RESULTS: To address this need, we have developed a mining technique based on the analysis of literature profiles generated by extracting the frequencies of certain terms from thousands of abstracts stored in the Medline literature database. Terms are then filtered on the basis of both repetitive occurrence and co-occurrence among multiple gene entries. Finally, clustering analysis is performed on the retained frequency values, shaping a coherent picture of the functional relationship among large and heterogeneous lists of genes. Such data treatment also provides information on the nature and pertinence of the associations that were formed. CONCLUSIONS: The analysis of patterns of term occurrence in abstracts constitutes a means of exploring the biological significance of large and heterogeneous lists of genes. This approach should contribute to optimizing the exploitation of microarray technologies by providing investigators with an interface between complex expression data and large literature resources.
UR  - http://www.scopus.com/inward/record.url?scp=0038017587&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0038017587&partnerID=8YFLogxK
M3  - Article
VL  - 3
JO  - Genome Biology
JF  - Genome Biology
SN  - 1474-7596
IS  - 10
ER  -