A machine learning approach to mass spectra classification with unsupervised feature selection

Michele Ceccarelli, Antonio D'Acierno, Angelo Facchiano

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Mass spectrometry spectra are recognized as a screening tool for detecting discriminatory protein patterns. Mass spectra, however, are high-dimensional data, and a large number of local maxima (a.k.a. peaks) have to be analyzed; to tackle this problem we have developed a three-step strategy. After data pre-processing, we perform an unsupervised feature selection phase aimed at detecting salient parts of the spectra that could be useful for the subsequent classification phase. The main contribution of the paper is the development of this feature selection and extraction procedure, grounded in the theory of multi-scale spaces. We then use support vector machines for classification. In the analysis of a data set of tumor/healthy samples, this approach correctly classified more than 95% of samples. ROC analysis has also been performed.
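
The abstract does not spell out the feature selection procedure, so the following is only a minimal sketch of one common reading of "multi-scale" peak selection, not the authors' method. It assumes a Gaussian scale space: the spectrum is smoothed at several scales, and a fine-scale local maximum is kept only if a maximum still exists nearby at every coarser scale. The function names, the `sigmas` values, the `tolerance` parameter, and the synthetic spectrum are all hypothetical choices made for illustration.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at 3 * sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve `signal` with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def local_maxima(signal):
    """Return indices of strict interior local maxima."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] > signal[i + 1]]


def persistent_peaks(signal, sigmas=(1.0, 2.0, 4.0), tolerance=2):
    """Keep finest-scale peaks that have a maximum within `tolerance`
    samples at every coarser scale (an assumed selection criterion)."""
    scales = sorted(sigmas)
    per_scale = [local_maxima(smooth(signal, s)) for s in scales]
    return [p for p in per_scale[0]
            if all(any(abs(p - q) <= tolerance for q in peaks)
                   for peaks in per_scale[1:])]


# Synthetic spectrum (hypothetical data): two broad peaks at indices 30 and 70
# plus a one-sample spike at index 45 standing in for noise.
spectrum = [10.0 * math.exp(-((i - 30) ** 2) / 50.0)
            + 6.0 * math.exp(-((i - 70) ** 2) / 50.0)
            for i in range(100)]
spectrum[45] += 1.0

selected = persistent_peaks(spectrum)
print(selected)  # [30, 70]
```

In this toy case the spike at index 45 is a local maximum at the finest scale but disappears at the coarsest one, so only the two broad peaks survive. The intensities at the selected positions could then serve as the feature vector passed to a classifier such as a support vector machine.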

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 242-252
Number of pages: 11
Volume: 5488 LNBI
DOI: https://doi.org/10.1007/978-3-642-02504-4_22
Publication status: Published - 28 Sep 2009
Externally published: Yes
Event: 5th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2008 - Vietri sul Mare, Italy
Duration: 3 Oct 2008 to 4 Oct 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5488 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2008
Country: Italy
City: Vietri sul Mare
Period: 3/10/08 to 4/10/08

Fingerprint

Feature Selection
Learning systems
Feature extraction
Machine Learning
ROC Analysis
Data Preprocessing
Scale Space
Mass Spectrometry
High-dimensional Data
Screening
Support vector machines
Tumors
Tumor
Support Vector Machine
Classify
Proteins
Protein
Processing

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Ceccarelli, M., D'Acierno, A., & Facchiano, A. (2009). A machine learning approach to mass spectra classification with unsupervised feature selection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5488 LNBI, pp. 242-252). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5488 LNBI). https://doi.org/10.1007/978-3-642-02504-4_22

@inproceedings{4db7d879e5754ac5ac9acbc387079f4e,
title = "A machine learning approach to mass spectra classification with unsupervised feature selection",
abstract = "Mass spectrometry spectra are recognized as a screening tool for detecting discriminatory protein patterns. Mass spectra, however, are high-dimensional data, and a large number of local maxima (a.k.a. peaks) have to be analyzed; to tackle this problem we have developed a three-step strategy. After data pre-processing, we perform an unsupervised feature selection phase aimed at detecting salient parts of the spectra that could be useful for the subsequent classification phase. The main contribution of the paper is the development of this feature selection and extraction procedure, grounded in the theory of multi-scale spaces. We then use support vector machines for classification. In the analysis of a data set of tumor/healthy samples, this approach correctly classified more than 95{\%} of samples. ROC analysis has also been performed.",
author = "Michele Ceccarelli and Antonio D'Acierno and Angelo Facchiano",
year = "2009",
month = "9",
day = "28",
doi = "10.1007/978-3-642-02504-4_22",
language = "English",
isbn = "364202503X",
volume = "5488 LNBI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "242--252",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
}

TY - GEN
T1 - A machine learning approach to mass spectra classification with unsupervised feature selection
AU - Ceccarelli, Michele
AU - D'Acierno, Antonio
AU - Facchiano, Angelo
PY - 2009/9/28
Y1 - 2009/9/28
AB - Mass spectrometry spectra are recognized as a screening tool for detecting discriminatory protein patterns. Mass spectra, however, are high-dimensional data, and a large number of local maxima (a.k.a. peaks) have to be analyzed; to tackle this problem we have developed a three-step strategy. After data pre-processing, we perform an unsupervised feature selection phase aimed at detecting salient parts of the spectra that could be useful for the subsequent classification phase. The main contribution of the paper is the development of this feature selection and extraction procedure, grounded in the theory of multi-scale spaces. We then use support vector machines for classification. In the analysis of a data set of tumor/healthy samples, this approach correctly classified more than 95% of samples. ROC analysis has also been performed.
UR - http://www.scopus.com/inward/record.url?scp=70349309557&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349309557&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-02504-4_22
DO - 10.1007/978-3-642-02504-4_22
M3 - Conference contribution
SN - 364202503X
SN - 9783642025037
VL - 5488 LNBI
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 242
EP - 252
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -