An empirical study of applying ensembles of heterogeneous classifiers on imperfect data

Kuo Wei Hsu, Jaideep Srivastava

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Two factors slow down the deployment of classification, or supervised learning, in real-world situations. One is that data are imperfect in practice; the other is that every technique has its own limits. Although techniques have been developed to address the imperfections of real-world data, no single technique outperforms all others, and each focuses on only some types of imperfection. Furthermore, quite a few works apply ensembles of heterogeneous classifiers to such situations. In this paper, we report work in progress that studies the impact of heterogeneity on ensembles, focusing on two aspects: diversity and classification quality for imbalanced data. Our goal is to evaluate how introducing heterogeneity into an ensemble influences its behavior and performance.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 28-39
Number of pages: 12
Volume: 5669 LNAI
DOIs: https://doi.org/10.1007/978-3-642-14640-4_3
Publication status: Published - 2010
Externally published: Yes
Event: 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009 - Bangkok
Duration: 27 Apr 2009 to 30 Apr 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5669 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009
City: Bangkok
Period: 27/4/09 to 30/4/09

Fingerprint

Imperfect
Empirical Study
Ensemble
Classifiers
Classifier
Supervised learning
Resolve
Evaluate
Influence

Keywords

  • AdaBoost
  • bagging
  • diversity
  • heterogeneity
  • imbalanced data

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Hsu, K. W., & Srivastava, J. (2010). An empirical study of applying ensembles of heterogeneous classifiers on imperfect data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5669 LNAI, pp. 28-39). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5669 LNAI). https://doi.org/10.1007/978-3-642-14640-4_3

An empirical study of applying ensembles of heterogeneous classifiers on imperfect data. / Hsu, Kuo Wei; Srivastava, Jaideep.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 5669 LNAI 2010. p. 28-39 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5669 LNAI).

Hsu, KW & Srivastava, J 2010, An empirical study of applying ensembles of heterogeneous classifiers on imperfect data. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 5669 LNAI, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5669 LNAI, pp. 28-39, 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009, Bangkok, 27/4/09. https://doi.org/10.1007/978-3-642-14640-4_3
Hsu KW, Srivastava J. An empirical study of applying ensembles of heterogeneous classifiers on imperfect data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 5669 LNAI. 2010. p. 28-39. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-642-14640-4_3
Hsu, Kuo Wei ; Srivastava, Jaideep. / An empirical study of applying ensembles of heterogeneous classifiers on imperfect data. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 5669 LNAI 2010. pp. 28-39 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{18162ad7f8914892b21244933682c378,
title = "An empirical study of applying ensembles of heterogeneous classifiers on imperfect data",
abstract = "Two factors slow down the deployment of classification, or supervised learning, in real-world situations. One is that data are imperfect in practice; the other is that every technique has its own limits. Although techniques have been developed to address the imperfections of real-world data, no single technique outperforms all others, and each focuses on only some types of imperfection. Furthermore, quite a few works apply ensembles of heterogeneous classifiers to such situations. In this paper, we report work in progress that studies the impact of heterogeneity on ensembles, focusing on two aspects: diversity and classification quality for imbalanced data. Our goal is to evaluate how introducing heterogeneity into an ensemble influences its behavior and performance.",
keywords = "AdaBoost, bagging, diversity, heterogeneity, imbalanced data",
author = "Hsu, {Kuo Wei} and Jaideep Srivastava",
year = "2010",
doi = "10.1007/978-3-642-14640-4_3",
language = "English",
isbn = "3642146392",
volume = "5669 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "28--39",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - An empirical study of applying ensembles of heterogeneous classifiers on imperfect data

AU - Hsu, Kuo Wei

AU - Srivastava, Jaideep

PY - 2010

Y1 - 2010

N2 - Two factors slow down the deployment of classification, or supervised learning, in real-world situations. One is that data are imperfect in practice; the other is that every technique has its own limits. Although techniques have been developed to address the imperfections of real-world data, no single technique outperforms all others, and each focuses on only some types of imperfection. Furthermore, quite a few works apply ensembles of heterogeneous classifiers to such situations. In this paper, we report work in progress that studies the impact of heterogeneity on ensembles, focusing on two aspects: diversity and classification quality for imbalanced data. Our goal is to evaluate how introducing heterogeneity into an ensemble influences its behavior and performance.

KW - AdaBoost

KW - bagging

KW - diversity

KW - heterogeneity

KW - imbalanced data

UR - http://www.scopus.com/inward/record.url?scp=77957079123&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77957079123&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-14640-4_3

DO - 10.1007/978-3-642-14640-4_3

M3 - Conference contribution

SN - 3642146392

SN - 9783642146398

VL - 5669 LNAI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 28

EP - 39

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -