Competence-conscious associative classification

Adriano Veloso, Mohammed Zaki, Wagner Meira, Marcos Gonçalves

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

The classification performance of an associative classifier is strongly dependent on the statistical measure or metric used to quantify the strength of the association between features and classes (e.g. confidence, correlation). Previous studies have shown that classifiers produced by different metrics may provide conflicting predictions, and that the best metric to use is data-dependent and rarely known while designing the classifier. This uncertainty concerning the optimal match between metrics and problems is a dilemma, and it prevents associative classifiers from achieving their maximal performance. This dilemma is the focus of this paper. A possible solution is to learn the competence, expertise, or assertiveness of metrics. The basic idea is that each metric has a specific sub-domain for which it is most competent (i.e. it consistently produces more accurate classifiers than those produced by other metrics). In particular, we investigate stacking-based meta-learning methods, which use the training data to find the domain of competence of each metric. The meta-classifier describes the domains of competence (or areas of expertise) of each metric, enabling a more sensible use of these metrics so that competence-conscious classifiers can be produced (i.e. a metric is only used to produce classifiers for test instances that belong to its domain of competence). We conducted a systematic and comprehensive evaluation, using different datasets and evaluation measures, of classifiers produced by different metrics. The result is that, while no metric is always superior to all others, selecting metrics according to their competence/expertise (i.e. competence-conscious associative classification) is very effective, showing gains that range from 1.2% to 26.3% over the baselines (SVMs and an existing ensemble method).
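The stacking idea described in the abstract can be sketched in a few lines: train one associative classifier per metric, use the training data to record where each metric predicts correctly (its domain of competence), and let a meta-classifier route each test instance to a competent metric. The toy sketch below is illustrative only; the single-feature rules, the two metrics (confidence and lift), and the overlap-based nearest-neighbour meta-learner are assumed simplifications, not the paper's exact algorithm.

```python
# Toy sketch of stacking-based, competence-conscious metric selection.
# Simplifications (my assumptions, not the paper's method): rules have
# single-feature antecedents, only two metrics, and the meta-classifier
# is a feature-overlap nearest-neighbour vote over the training data.
from collections import Counter, defaultdict

def confidence(rule_sup, ante_sup, cls_sup, n):
    # conf(A -> c) = sup(A and c) / sup(A)
    return rule_sup / ante_sup if ante_sup else 0.0

def lift(rule_sup, ante_sup, cls_sup, n):
    # lift(A -> c) = conf(A -> c) / P(c)
    conf = confidence(rule_sup, ante_sup, cls_sup, n)
    prior = cls_sup / n if n else 0.0
    return conf / prior if prior else 0.0

METRICS = {"confidence": confidence, "lift": lift}

def mine_stats(train):
    """Count supports of single-feature antecedents, rules, and classes."""
    ante, rule, cls = Counter(), Counter(), Counter()
    for feats, c in train:
        cls[c] += 1
        for f in feats:
            ante[f] += 1
            rule[(f, c)] += 1
    return len(train), ante, rule, cls

def predict(feats, metric, stats):
    """Score each class by summing the metric over the matching rules."""
    n, ante, rule, cls = stats
    scores = defaultdict(float)
    for c in cls:
        for f in feats:
            scores[c] += metric(rule[(f, c)], ante[f], cls[c], n)
    return max(scores, key=scores.get)

def build_meta(train):
    """Stacking step: record which metrics get each training instance right."""
    stats = mine_stats(train)
    meta = [(feats, {name for name, m in METRICS.items()
                     if predict(feats, m, stats) == c})
            for feats, c in train]
    return stats, meta

def competence_predict(feats, stats, meta):
    """Route the instance to the metric most often correct on its neighbours."""
    best_sim = max(len(feats & tf) for tf, _ in meta)
    votes = Counter()
    for tf, correct in meta:
        if len(feats & tf) == best_sim:
            votes.update(correct)
    name = votes.most_common(1)[0][0] if votes else "confidence"
    return predict(feats, METRICS[name], stats), name

TRAIN = [({"a", "b"}, "pos"), ({"a"}, "pos"), ({"a", "c"}, "pos"),
         ({"b", "c"}, "neg"), ({"c"}, "neg"), ({"b"}, "neg")]
stats, meta = build_meta(TRAIN)
print(competence_predict({"a"}, stats, meta))       # class plus chosen metric
print(competence_predict({"b", "c"}, stats, meta))
```

On this toy data both metrics happen to agree, so the value of the sketch is the plumbing: the meta-level records per-instance correctness of each base metric and consults it at prediction time, which is the competence-conscious selection the abstract describes.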

Original language: English
Pages (from-to): 361-377
Number of pages: 17
Journal: Statistical Analysis and Data Mining
Volume: 2
Issue number: 5-6
DOI: 10.1002/sam.10058
Publication status: Published - 1 Dec 2009
Externally published: Yes

Keywords

  • Associative classification
  • Machine learning
  • Meta-learning

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Analysis

Cite this

Veloso, A., Zaki, M., Meira, W., & Gonçalves, M. (2009). Competence-conscious associative classification. Statistical Analysis and Data Mining, 2(5-6), 361-377. https://doi.org/10.1002/sam.10058

@article{38f48a0584874045b0d2283aba471c9b,
title = "Competence-conscious associative classification",
keywords = "Associative classification, Machine learning, Meta-learning",
author = "Adriano Veloso and Mohammed Zaki and Wagner Meira and Marcos Gon{\c{c}}alves",
year = "2009",
month = "12",
day = "1",
doi = "10.1002/sam.10058",
language = "English",
volume = "2",
pages = "361--377",
journal = "Statistical Analysis and Data Mining",
issn = "1932-1872",
publisher = "John Wiley and Sons Inc.",
number = "5-6",

}
