Is there a best quality metric for graph clusters?

Hélio Almeida, Dorgival Guedes, Wagner Meira, Mohammed J. Zaki

Research output: Chapter in Book/Report/Conference proceedingConference contribution

46 Citations (Scopus)

Abstract

Graph clustering, the process of discovering groups of similar vertices in a graph, is a very interesting area of study, with applications in many different scenarios. One of the most important aspects of graph clustering is the evaluation of cluster quality, which is important not only to measure the effectiveness of clustering algorithms, but also to give insights on the dynamics of relationships in a given network. Many quality evaluation metrics for graph clustering have been proposed in the literature, but there is no consensus on how do they compare to each other and how well they perform on different kinds of graphs. In this work we study five major graph clustering quality metrics in terms of their formal biases and their behavior when applied to clusters found by four implementations of classic graph clustering algorithms on five large, real world graphs. Our results show that those popular quality metrics have strong biases toward incorrectly awarding good scores to some kinds of clusters, especially seen in larger networks. They also indicate that currently used clustering algorithms and quality metrics do not behave as expected when cluster structures are different from the more traditional, clique-like ones.

Original languageEnglish
Title of host publicationLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages44-59
Number of pages16
Volume6911 LNAI
EditionPART 1
DOIs
Publication statusPublished - 9 Sep 2011
Externally publishedYes
EventEuropean Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2011 - Athens, Greece
Duration: 5 Sep 20119 Sep 2011

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
NumberPART 1
Volume6911 LNAI
ISSN (Print)03029743
ISSN (Electronic)16113349

Other

OtherEuropean Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2011
CountryGreece
CityAthens
Period5/9/119/9/11

Fingerprint

Graph Clustering
Clustering algorithms
Metric
Clustering Algorithm
Graph in graph theory
Quality Evaluation
Graph Algorithms
Clique
Scenarios
Evaluation

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Almeida, H., Guedes, D., Meira, W., & Zaki, M. J. (2011). Is there a best quality metric for graph clusters? In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (PART 1 ed., Vol. 6911 LNAI, pp. 44-59). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6911 LNAI, No. PART 1). https://doi.org/10.1007/978-3-642-23780-5_13

Is there a best quality metric for graph clusters? / Almeida, Hélio; Guedes, Dorgival; Meira, Wagner; Zaki, Mohammed J.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 6911 LNAI PART 1. ed. 2011. p. 44-59 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6911 LNAI, No. PART 1).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Almeida, H, Guedes, D, Meira, W & Zaki, MJ 2011, Is there a best quality metric for graph clusters? in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). PART 1 edn, vol. 6911 LNAI, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), no. PART 1, vol. 6911 LNAI, pp. 44-59, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2011, Athens, Greece, 5/9/11. https://doi.org/10.1007/978-3-642-23780-5_13
Almeida H, Guedes D, Meira W, Zaki MJ. Is there a best quality metric for graph clusters? In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). PART 1 ed. Vol. 6911 LNAI. 2011. p. 44-59. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 1). https://doi.org/10.1007/978-3-642-23780-5_13
Almeida, Hélio ; Guedes, Dorgival ; Meira, Wagner ; Zaki, Mohammed J. / Is there a best quality metric for graph clusters?. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 6911 LNAI PART 1. ed. 2011. pp. 44-59 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 1).
@inproceedings{8249822e5c43488ebf52c5312f74c901,
title = "Is there a best quality metric for graph clusters?",
abstract = "Graph clustering, the process of discovering groups of similar vertices in a graph, is a very interesting area of study, with applications in many different scenarios. One of the most important aspects of graph clustering is the evaluation of cluster quality, which is important not only to measure the effectiveness of clustering algorithms, but also to give insights on the dynamics of relationships in a given network. Many quality evaluation metrics for graph clustering have been proposed in the literature, but there is no consensus on how do they compare to each other and how well they perform on different kinds of graphs. In this work we study five major graph clustering quality metrics in terms of their formal biases and their behavior when applied to clusters found by four implementations of classic graph clustering algorithms on five large, real world graphs. Our results show that those popular quality metrics have strong biases toward incorrectly awarding good scores to some kinds of clusters, especially seen in larger networks. They also indicate that currently used clustering algorithms and quality metrics do not behave as expected when cluster structures are different from the more traditional, clique-like ones.",
author = "H{\'e}lio Almeida and Dorgival Guedes and Wagner Meira and Zaki, {Mohammed J.}",
year = "2011",
month = "9",
day = "9",
doi = "10.1007/978-3-642-23780-5_13",
language = "English",
isbn = "9783642237799",
volume = "6911 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 1",
pages = "44--59",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
edition = "PART 1",

}

TY - GEN

T1 - Is there a best quality metric for graph clusters?

AU - Almeida, Hélio

AU - Guedes, Dorgival

AU - Meira, Wagner

AU - Zaki, Mohammed J.

PY - 2011/9/9

Y1 - 2011/9/9

N2 - Graph clustering, the process of discovering groups of similar vertices in a graph, is a very interesting area of study, with applications in many different scenarios. One of the most important aspects of graph clustering is the evaluation of cluster quality, which is important not only to measure the effectiveness of clustering algorithms, but also to give insights on the dynamics of relationships in a given network. Many quality evaluation metrics for graph clustering have been proposed in the literature, but there is no consensus on how do they compare to each other and how well they perform on different kinds of graphs. In this work we study five major graph clustering quality metrics in terms of their formal biases and their behavior when applied to clusters found by four implementations of classic graph clustering algorithms on five large, real world graphs. Our results show that those popular quality metrics have strong biases toward incorrectly awarding good scores to some kinds of clusters, especially seen in larger networks. They also indicate that currently used clustering algorithms and quality metrics do not behave as expected when cluster structures are different from the more traditional, clique-like ones.

AB - Graph clustering, the process of discovering groups of similar vertices in a graph, is a very interesting area of study, with applications in many different scenarios. One of the most important aspects of graph clustering is the evaluation of cluster quality, which is important not only to measure the effectiveness of clustering algorithms, but also to give insights on the dynamics of relationships in a given network. Many quality evaluation metrics for graph clustering have been proposed in the literature, but there is no consensus on how do they compare to each other and how well they perform on different kinds of graphs. In this work we study five major graph clustering quality metrics in terms of their formal biases and their behavior when applied to clusters found by four implementations of classic graph clustering algorithms on five large, real world graphs. Our results show that those popular quality metrics have strong biases toward incorrectly awarding good scores to some kinds of clusters, especially seen in larger networks. They also indicate that currently used clustering algorithms and quality metrics do not behave as expected when cluster structures are different from the more traditional, clique-like ones.

UR - http://www.scopus.com/inward/record.url?scp=80052398024&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80052398024&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-23780-5_13

DO - 10.1007/978-3-642-23780-5_13

M3 - Conference contribution

AN - SCOPUS:80052398024

SN - 9783642237799

VL - 6911 LNAI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 44

EP - 59

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -