Investigating the minimum required number of genes for the classification of neuromuscular disease microarray data

Argiris Sakellariou, Despina Sanoudou, George Spyrou

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

The discovery of potential microarray markers, which will expedite molecular diagnosis/prognosis and provide reliable results to clinical decision-making and treatment selection for patients, is of paramount importance. Feature selection techniques, which aim at minimizing the dimensionality of the microarray data by keeping the most statistically significant genes, are a powerful approach toward this goal. In this paper, we investigate the minimal subsets of genes that best classify neuromuscular disease data. For this purpose, we implemented a methodology pipeline that facilitated the use of multiple feature selection methods followed by data classification. We applied five feature selection methods to datasets from ten different neuromuscular diseases. Our findings reveal subsets containing very few genes that can successfully classify normal/disease samples. Interestingly, we observe that similar classification results may be obtained from different subsets of genes. The proposed methodology can expedite the identification of small gene subsets with high classification accuracy that could ultimately be used in genetics clinics for diagnostic, prognostic, and pharmacogenomic purposes.
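The abstract does not name the five feature selection methods or the classifier, so the following is only an illustrative sketch of the general idea (rank genes, keep the top k, classify, and look for the smallest k that still classifies well). Everything in it is an assumption made for illustration: the synthetic data, the t-statistic ranking, the nearest-centroid classifier, leave-one-out cross-validation, and all function names and thresholds. It is not the authors' pipeline.

```python
import random
import statistics


def make_synthetic(n_per_class=12, n_genes=100, n_informative=3, shift=4.0, seed=0):
    """Toy expression matrix: rows are samples, columns are genes (all values invented)."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label in (0, 1):  # 0 = "normal", 1 = "disease" (hypothetical labels)
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in range(n_informative):
                    row[g] += shift  # only the first few genes carry signal
            samples.append(row)
            labels.append(label)
    return samples, labels


def t_score(a, b):
    """Absolute Welch-style t statistic between two lists of values."""
    va, vb = statistics.variance(a), statistics.variance(b)
    denom = (va / len(a) + vb / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / denom if denom > 0 else 0.0


def rank_genes(samples, labels):
    """Return gene indices ordered from most to least discriminative."""
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        a = [s[g] for s, y in zip(samples, labels) if y == 0]
        b = [s[g] for s, y in zip(samples, labels) if y == 1]
        scores.append(t_score(a, b))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)


def nearest_centroid_predict(train, train_labels, test_row, genes):
    """Assign test_row to the class whose centroid is closest on the chosen genes."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_labels)):
        rows = [s for s, y in zip(train, train_labels) if y == label]
        centroid = [statistics.mean(r[g] for r in rows) for g in genes]
        dist = sum((test_row[g] - c) ** 2 for g, c in zip(genes, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(samples, labels, k):
    """Leave-one-out accuracy using the top-k genes, re-selected inside each fold."""
    correct = 0
    for i in range(len(samples)):
        train = samples[:i] + samples[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        genes = rank_genes(train, train_labels)[:k]  # selection sees training data only
        pred = nearest_centroid_predict(train, train_labels, samples[i], genes)
        correct += pred == labels[i]
    return correct / len(samples)


def smallest_subset(samples, labels, max_k=10, target=0.95):
    """Return (k, accuracy) for the smallest k reaching the target, else (None, None)."""
    for k in range(1, max_k + 1):
        acc = loocv_accuracy(samples, labels, k)
        if acc >= target:
            return k, acc
    return None, None


if __name__ == "__main__":
    X, y = make_synthetic()
    print(smallest_subset(X, y))
```

Gene ranking is repeated inside every cross-validation fold so that the held-out sample never influences which genes are chosen; selecting genes on the full dataset first would tend to overstate accuracy. Swapping in a different ranking function is one way to explore the abstract's observation that different gene subsets may give similar classification results.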

Original language: English
Article number: 5735227
Pages (from-to): 349-355
Number of pages: 7
Journal: IEEE Transactions on Information Technology in Biomedicine
Volume: 15
Issue number: 3
DOI: 10.1109/TITB.2011.2130531
Publication status: Published - 1 May 2011
Externally published: Yes

Fingerprint

Neuromuscular Diseases
Microarrays
Genes
Feature extraction
Pharmacogenetics
Patient Selection
Pipelines
Decision making

Keywords

  • Feature selection
  • microarray data analysis
  • molecular diagnosis
  • neuromuscular diseases

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Biotechnology
  • Computer Science Applications
  • Medicine (all)

Cite this

Investigating the minimum required number of genes for the classification of neuromuscular disease microarray data. / Sakellariou, Argiris; Sanoudou, Despina; Spyrou, George.

In: IEEE Transactions on Information Technology in Biomedicine, Vol. 15, No. 3, 5735227, 01.05.2011, p. 349-355.

@article{cd2ad6c4826e488f88a9a965956e7da9,
title = "Investigating the minimum required number of genes for the classification of neuromuscular disease microarray data",
keywords = "Feature selection, microarray data analysis, molecular diagnosis, neuromuscular diseases",
author = "Argiris Sakellariou and Despina Sanoudou and George Spyrou",
year = "2011",
month = "5",
day = "1",
doi = "10.1109/TITB.2011.2130531",
language = "English",
volume = "15",
pages = "349--355",
journal = "IEEE Journal of Biomedical and Health Informatics",
issn = "2168-2194",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "3",

}

TY - JOUR

T1 - Investigating the minimum required number of genes for the classification of neuromuscular disease microarray data

AU - Sakellariou, Argiris

AU - Sanoudou, Despina

AU - Spyrou, George

PY - 2011/5/1

Y1 - 2011/5/1

KW - Feature selection

KW - microarray data analysis

KW - molecular diagnosis

KW - neuromuscular diseases

UR - http://www.scopus.com/inward/record.url?scp=79955644353&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79955644353&partnerID=8YFLogxK

U2 - 10.1109/TITB.2011.2130531

DO - 10.1109/TITB.2011.2130531

M3 - Article

VL - 15

SP - 349

EP - 355

JO - IEEE Journal of Biomedical and Health Informatics

JF - IEEE Journal of Biomedical and Health Informatics

SN - 2168-2194

IS - 3

M1 - 5735227

ER -