Calibrated lazy associative classification

Adriano Veloso, Wagner Meira, Marcos Gonçalves, Humberto M. Almeida, Mohammed Zaki

Research output: Contribution to journal › Article

24 Citations (Scopus)

Abstract

Classification is a popular machine learning task. Given an example x and a class c, a classifier usually works by estimating the probability of x being a member of c (i.e., the membership probability). Well-calibrated classifiers are those able to provide accurate estimates of class membership probabilities: the estimated probability p̂(c|x) is close to p(c|p̂(c|x)), the true (unknown) empirical probability that x is a member of c given that the classifier's estimate is p̂(c|x). Calibration is not a necessary property for producing accurate classifiers, and thus most research has focused on direct accuracy-maximization strategies rather than on calibration. However, non-calibrated classifiers are problematic in applications where the reliability associated with a prediction must be taken into account. In such applications, sensible use of the classifier must be based on the reliability of its predictions, and thus the classifier must be well calibrated. In this paper we show that lazy associative classifiers (LAC) are well calibrated when an MDL-based entropy minimization method is used. We investigate important applications where these characteristics (i.e., accuracy and calibration) are relevant, and we demonstrate empirically that LAC outperforms other classifiers such as SVMs, Naive Bayes, and Decision Trees (even after those classifiers are calibrated). Additional highlights of LAC include the ability to incorporate reliable predictions to improve training, and the ability to refrain from doubtful predictions.
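The calibration property described in the abstract — p̂(c|x) close to p(c|p̂(c|x)) — can be checked empirically by binning a classifier's predicted probabilities and comparing each bin's average confidence to the observed class frequency. The sketch below is a generic binned-calibration check (expected calibration error); it is an illustration of the property only, not the paper's MDL-based entropy minimization method, and the function name is ours.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned calibration estimate: within each confidence bin, compare the
    mean predicted probability p̂(c|x) to the empirical frequency of class c,
    and average the gaps weighted by bin size."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # half-open bins (lo, hi], with the first bin closed at 0
        mask = (probs > lo) & (probs <= hi) if lo > 0 else (probs >= lo) & (probs <= hi)
        if not mask.any():
            continue
        conf = probs[mask].mean()   # average estimated membership probability
        acc = labels[mask].mean()   # empirical frequency of the class in this bin
        ece += mask.mean() * abs(conf - acc)
    return ece
```

A perfectly calibrated set of predictions (e.g., confidence 0.8 on examples that are positive 80% of the time) yields an error near zero, while systematic over-confidence inflates it.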

Original language: English
Pages (from-to): 2656-2670
Number of pages: 15
Journal: Information Sciences
Volume: 181
Issue number: 13
DOI: 10.1016/j.ins.2010.03.007
ISSN: 0020-0255
Publisher: Elsevier Inc.
Publication status: Published - 1 Jul 2011
Externally published: Yes

Keywords

  • Calibration
  • Classification
  • MDL

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management

Cite this

Veloso, A., Meira, W., Gonçalves, M., Almeida, H. M., & Zaki, M. (2011). Calibrated lazy associative classification. Information Sciences, 181(13), 2656-2670. https://doi.org/10.1016/j.ins.2010.03.007
