Class confidence weighted kNN algorithms for imbalanced data sets

Wei Liu, Sanjay Chawla

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

80 Citations (Scopus)

Abstract

In this paper, a novel k-nearest neighbors (kNN) weighting strategy is proposed for handling the problem of class imbalance. When dealing with highly imbalanced data, a salient drawback of existing kNN algorithms is that the class with more frequent samples tends to dominate the neighborhood of a test instance regardless of distance measurements, which leads to suboptimal classification performance on the minority class. To solve this problem, we propose CCW (class confidence weights), which use the probability of attribute values given class labels to weight prototypes in kNN. The main advantage of CCW is that it corrects the inherent bias toward the majority class in existing kNN algorithms under any distance measurement. Theoretical analysis and comprehensive experiments confirm our claims.
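The weighting idea in the abstract can be illustrated with a minimal sketch. Each training prototype x_i with label y_i votes with weight p(x_i | y_i) instead of a unit vote. The abstract does not say how that probability is estimated, so the per-class independent Gaussian density used here is purely an assumption for illustration, and the function names (`fit_class_gaussians`, `class_confidence`, `ccw_knn_predict`, `plain_knn_predict`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_class_gaussians(X, y, eps=1e-6):
    """Per-class, per-feature mean and variance (an assumed density model)."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    params = {}
    for label, rows in by_class.items():
        n, dims = len(rows), len(rows[0])
        means = [sum(r[d] for r in rows) / n for d in range(dims)]
        # eps keeps the variance strictly positive for degenerate classes
        variances = [
            sum((r[d] - means[d]) ** 2 for r in rows) / n + eps
            for d in range(dims)
        ]
        params[label] = (means, variances)
    return params


def class_confidence(x, label, params):
    """Estimate p(x | label) as a product of independent Gaussians."""
    means, variances = params[label]
    log_p = 0.0
    for v, m, s2 in zip(x, means, variances):
        log_p += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
    return math.exp(log_p)


def k_nearest(X, query, k):
    """Indices of the k training points closest to query (Euclidean)."""
    return sorted(range(len(X)), key=lambda i: math.dist(X[i], query))[:k]


def plain_knn_predict(X, y, query, k=5):
    """Unweighted majority vote among the k nearest neighbors."""
    votes = defaultdict(int)
    for i in k_nearest(X, query, k):
        votes[y[i]] += 1
    return max(votes, key=votes.get)


def ccw_knn_predict(X, y, query, k=5):
    """Each neighbor votes with weight p(x_i | y_i) instead of 1."""
    params = fit_class_gaussians(X, y)
    votes = defaultdict(float)
    for i in k_nearest(X, query, k):
        votes[y[i]] += class_confidence(X[i], y[i], params)
    return max(votes, key=votes.get)


# Toy 1-D data: class 0 is frequent and spread out, class 1 is rare and tight.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [4.4], [4.6]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

print(plain_knn_predict(X, y, [4.5], k=5))
print(ccw_knn_predict(X, y, [4.5], k=5))
```

On this toy set the five nearest neighbors of the query contain three majority points and two minority points, so the unweighted vote goes to the majority class. The minority points sit in a tight cluster and receive a much larger assumed density, so the weighted vote goes to the minority class. The distance function inside `k_nearest` can be swapped without touching the weights, which is the sense in which the weighting is independent of the distance measure.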

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 345-356
Number of pages: 12
Volume: 6635 LNAI
Edition: PART 2
DOIs: https://doi.org/10.1007/978-3-642-20847-8-29
Publication status: Published - 2011
Externally published: Yes
Event: 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2011 - Shenzhen
Duration: 24 May 2011 to 27 May 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6635 LNAI
ISSN (Print): 03029743
ISSN (Electronic): 16113349

Other

Other: 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2011
City: Shenzhen
Period: 24/5/11 to 27/5/11

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Liu, W., & Chawla, S. (2011). Class confidence weighted kNN algorithms for imbalanced data sets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (PART 2 ed., Vol. 6635 LNAI, pp. 345-356). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6635 LNAI, No. PART 2). https://doi.org/10.1007/978-3-642-20847-8-29