A fuzzy extension of some classical concordance measures and an efficient algorithm for their computation

Michele Ceccarelli, Antonio Maratea

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

6 Citations (Scopus)

Abstract

Many indexes have been proposed in the literature for comparing two crisp data partitions, whether they result from two different classification attempts, two different clustering solutions, or the comparison of a predicted labeling with a true one. Most implementations of these indexes have a computational cost of O(N^2), where N is the number of data points, which may limit their use on very large datasets or their integration into computationally intensive validation strategies. Furthermore, their extension to fuzzy partitions is not obvious. In this paper we analyze efficient algorithms that compute many classical indexes (most notably the Jaccard coefficient and the Rand index) in O(d^2 + N), where d is the number of different classes/clusters, and propose a straightforward procedure to extend their computation to fuzzy partitions. The fuzzy extension is based on a pseudo-count concept and provides a natural framework for including memberships in the computation of binary similarity indexes, not limited to the ones reviewed here. Results on simulated data using the Jaccard coefficient highlight a higher consistency of its proposed fuzzy extension with respect to its crisp counterpart.
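
As a rough illustration of how the O(d^2 + N) cost for crisp partitions can arise, the sketch below uses the standard contingency-table identity for pair counts. It is not taken from the paper: the function names (`pair_counts`, `rand_index`, `jaccard_index`) are hypothetical, and the fuzzy pseudo-count extension is not covered.

```python
from collections import Counter


def pair_counts(labels_a, labels_b):
    """Count point pairs by agreement, without enumerating the pairs.

    Returns (n11, n10, n01, n00):
      n11: pairs grouped together in both partitions
      n10: pairs together in A only
      n01: pairs together in B only
      n00: pairs separated in both partitions
    """
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both labelings must cover the same points")

    # One pass over the data builds the contingency table: O(N).
    table = Counter(zip(labels_a, labels_b))
    rows = Counter(labels_a)
    cols = Counter(labels_b)

    def choose2(c):
        return c * (c - 1) // 2

    # Summing over the (at most d x d) table cells and margins: O(d^2).
    together_both = sum(choose2(c) for c in table.values())
    together_a = sum(choose2(c) for c in rows.values())
    together_b = sum(choose2(c) for c in cols.values())

    n11 = together_both
    n10 = together_a - together_both
    n01 = together_b - together_both
    n00 = choose2(n) - n11 - n10 - n01
    return n11, n10, n01, n00


def rand_index(labels_a, labels_b):
    n11, n10, n01, n00 = pair_counts(labels_a, labels_b)
    total = n11 + n10 + n01 + n00
    # With fewer than two points there are no pairs; treat as full agreement.
    return (n11 + n00) / total if total else 1.0


def jaccard_index(labels_a, labels_b):
    n11, n10, n01, _ = pair_counts(labels_a, labels_b)
    denom = n11 + n10 + n01
    # No pair is together in either partition; treat as full agreement.
    return n11 / denom if denom else 1.0


if __name__ == "__main__":
    a = [0, 0, 1, 1]
    b = [0, 0, 1, 2]
    print(pair_counts(a, b))
    print(rand_index(a, b), jaccard_index(a, b))
```

The conventions for the degenerate cases (no pairs, or an empty Jaccard denominator) are a choice made for this sketch; other implementations may return 0 or raise instead.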

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 755-763
Number of pages: 9
Volume: 5179 LNAI
Edition: PART 3
DOIs: https://doi.org/10.1007/978-3-540-85567-5-94
Publication status: Published - 24 Dec 2008
Externally published: Yes
Event: 12th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, KES 2008 - Zagreb, Croatia
Duration: 3 Sep 2008 to 5 Sep 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 3
Volume: 5179 LNAI
ISSN (Print): 03029743
ISSN (Electronic): 16113349

Other

Other: 12th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, KES 2008
Country: Croatia
City: Zagreb
Period: 3/9/08 to 5/9/08

Fingerprint

Concordance
Efficient Algorithms
Fuzzy Partition
Labeling
Similarity Index
Coefficient
Computational Cost
Costs
Count
Partition
Clustering
Binary

Keywords

  • Cluster stability
  • Concordance measure
  • Efficient algorithm
  • Validity index

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Ceccarelli, M., & Maratea, A. (2008). A fuzzy extension of some classical concordance measures and an efficient algorithm for their computation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (PART 3 ed., Vol. 5179 LNAI, pp. 755-763). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5179 LNAI, No. PART 3). https://doi.org/10.1007/978-3-540-85567-5-94

@inproceedings{c346546b905044aeb12aca66ec39ae24,
title = "A fuzzy extension of some classical concordance measures and an efficient algorithm for their computation",
keywords = "Cluster stability, Concordance measure, Efficient algorithm, Validity index",
author = "Michele Ceccarelli and Antonio Maratea",
year = "2008",
month = "12",
day = "24",
doi = "10.1007/978-3-540-85567-5-94",
language = "English",
isbn = "3540855661",
volume = "5179 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 3",
pages = "755--763",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
edition = "PART 3",

}

TY - GEN

T1 - A fuzzy extension of some classical concordance measures and an efficient algorithm for their computation

AU - Ceccarelli, Michele

AU - Maratea, Antonio

PY - 2008/12/24

Y1 - 2008/12/24

KW - Cluster stability

KW - Concordance measure

KW - Efficient algorithm

KW - Validity index

UR - http://www.scopus.com/inward/record.url?scp=57749174020&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=57749174020&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-85567-5-94

DO - 10.1007/978-3-540-85567-5-94

M3 - Conference contribution

SN - 3540855661

SN - 9783540855668

VL - 5179 LNAI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 755

EP - 763

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -