Multilevel hierarchical kernel spectral clustering for real-life large scale complex networks

Raghvendra Mall, Rocco Langone, Johan A.K. Suykens

Research output: Contribution to journal › Article

19 Citations (Scopus)

Abstract

Kernel spectral clustering corresponds to a weighted kernel principal component analysis problem in a constrained optimization framework. The primal formulation leads to an eigen-decomposition of a centered Laplacian matrix at the dual level. The dual formulation allows a model to be built on a representative subgraph of the large scale network in the training phase, and the model parameters are estimated in the validation stage. The KSC model has a powerful out-of-sample extension property which allows cluster affiliation for the unseen nodes of the big data network. In this paper we exploit the structure of the projections in the eigenspace during the validation stage to automatically determine a set of increasing distance thresholds. We use these distance thresholds in the test phase to obtain multiple levels of hierarchy for the large scale network. The hierarchical structure in the network is determined in a bottom-up fashion. We empirically showcase that real-world networks have multilevel hierarchical organization which cannot be detected efficiently by several state-of-the-art large scale hierarchical community detection techniques like the Louvain, OSLOM and Infomap methods. We show that a major advantage of our proposed approach is the ability to locate good quality clusters at both the finer and coarser levels of hierarchy using internal cluster quality metrics on 7 real-life networks.
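
The bottom-up use of increasing distance thresholds can be illustrated with a minimal sketch. It assumes Euclidean distances between cluster centroids and hand-picked thresholds; the abstract specifies neither the distance measure nor how the thresholds are derived, so this is not the authors' procedure. The names `build_hierarchy`, `merge_level`, and `centroid` are hypothetical.

```python
from math import dist


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def merge_level(centroids, threshold):
    """Map each cluster id to a merged id; clusters whose centroids lie
    closer than `threshold` end up in the same group (union-find)."""
    ids = list(centroids)
    parent = {c: c for c in ids}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if dist(centroids[a], centroids[b]) < threshold:
                parent[find(b)] = find(a)
    return {c: find(c) for c in ids}


def build_hierarchy(projections, thresholds):
    """Return one label list per threshold, from finest to coarsest level.

    Assumption: every node starts as its own cluster, and each level
    merges the clusters of the previous level.
    """
    labels = list(range(len(projections)))
    levels = []
    for t in sorted(thresholds):
        members = {}
        for p, lab in zip(projections, labels):
            members.setdefault(lab, []).append(p)
        cents = {lab: centroid(pts) for lab, pts in members.items()}
        mapping = merge_level(cents, t)
        labels = [mapping[lab] for lab in labels]
        levels.append(labels)
    return levels


# Toy 2-D "projections"; real ones would come from the trained model.
projections = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0), (5.0, 0.0)]
thresholds = [0.2, 1.5, 6.0]
hierarchy = build_hierarchy(projections, thresholds)
for t, labels in zip(thresholds, hierarchy):
    print(f"threshold {t}: {len(set(labels))} clusters")
```

On this toy input the three thresholds yield three, two, and one cluster respectively, so each larger threshold gives a coarser level of the hierarchy.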

Original language: English
Article number: e99966
Journal: PLoS One
Volume: 9
Issue number: 6
DOIs: 10.1371/journal.pone.0099966
Publication status: Published - 20 Jun 2014
Externally published: Yes

Fingerprint

Complex networks
Principal Component Analysis
Cluster Analysis
seeds
system optimization
Constrained optimization
Decomposition
degradation
methodology
testing
sampling

ASJC Scopus subject areas

  • Medicine (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Agricultural and Biological Sciences (all)

Cite this

Multilevel hierarchical kernel spectral clustering for real-life large scale complex networks. / Mall, Raghvendra; Langone, Rocco; Suykens, Johan A.K.

In: PLoS One, Vol. 9, No. 6, e99966, 20.06.2014.

@article{43f90592b4374802b1be91bd1c2915ca,
title = "Multilevel hierarchical kernel spectral clustering for real-life large scale complex networks",
abstract = "Kernel spectral clustering corresponds to a weighted kernel principal component analysis problem in a constrained optimization framework. The primal formulation leads to an eigen-decomposition of a centered Laplacian matrix at the dual level. The dual formulation allows a model to be built on a representative subgraph of the large scale network in the training phase, and the model parameters are estimated in the validation stage. The KSC model has a powerful out-of-sample extension property which allows cluster affiliation for the unseen nodes of the big data network. In this paper we exploit the structure of the projections in the eigenspace during the validation stage to automatically determine a set of increasing distance thresholds. We use these distance thresholds in the test phase to obtain multiple levels of hierarchy for the large scale network. The hierarchical structure in the network is determined in a bottom-up fashion. We empirically showcase that real-world networks have multilevel hierarchical organization which cannot be detected efficiently by several state-of-the-art large scale hierarchical community detection techniques like the Louvain, OSLOM and Infomap methods. We show that a major advantage of our proposed approach is the ability to locate good quality clusters at both the finer and coarser levels of hierarchy using internal cluster quality metrics on 7 real-life networks.",
author = "Raghvendra Mall and Rocco Langone and Suykens, {Johan A.K.}",
year = "2014",
month = "6",
day = "20",
doi = "10.1371/journal.pone.0099966",
language = "English",
volume = "9",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "6",

}

TY - JOUR

T1 - Multilevel hierarchical kernel spectral clustering for real-life large scale complex networks

AU - Mall, Raghvendra

AU - Langone, Rocco

AU - Suykens, Johan A.K.

PY - 2014/6/20

Y1 - 2014/6/20

AB - Kernel spectral clustering corresponds to a weighted kernel principal component analysis problem in a constrained optimization framework. The primal formulation leads to an eigen-decomposition of a centered Laplacian matrix at the dual level. The dual formulation allows a model to be built on a representative subgraph of the large scale network in the training phase, and the model parameters are estimated in the validation stage. The KSC model has a powerful out-of-sample extension property which allows cluster affiliation for the unseen nodes of the big data network. In this paper we exploit the structure of the projections in the eigenspace during the validation stage to automatically determine a set of increasing distance thresholds. We use these distance thresholds in the test phase to obtain multiple levels of hierarchy for the large scale network. The hierarchical structure in the network is determined in a bottom-up fashion. We empirically showcase that real-world networks have multilevel hierarchical organization which cannot be detected efficiently by several state-of-the-art large scale hierarchical community detection techniques like the Louvain, OSLOM and Infomap methods. We show that a major advantage of our proposed approach is the ability to locate good quality clusters at both the finer and coarser levels of hierarchy using internal cluster quality metrics on 7 real-life networks.

UR - http://www.scopus.com/inward/record.url?scp=84903289109&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84903289109&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0099966

DO - 10.1371/journal.pone.0099966

M3 - Article

C2 - 24949877

AN - SCOPUS:84903289109

VL - 9

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 6

M1 - e99966

ER -