Kernel spectral document clustering using unsupervised precision-recall metrics

Raghvendra Mall, Johan A.K. Suykens

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Kernel Spectral Clustering (KSC) solves a weighted kernel principal component analysis problem in a primal-dual optimization framework. The KSC model is built on a small subset of the data through training, model selection and test phases. The clustering model is obtained from the dual solution of the problem and has a powerful out-of-sample extension property, which allows cluster affiliation to be inferred for previously unseen data points. In the model selection phase, we estimate the appropriate number of clusters using a metric that evaluates the quality of the clusters. Traditional quality indices such as inertia, the Davies-Bouldin (DB) index and the silhouette (SIL) index are known to be method-dependent and do not perform well on complex heterogeneous data such as text. In this paper, we use quality evaluation techniques based on an unsupervised version of Precision, Recall and F-measure proposed in [1] to build a new kernel spectral document clustering (KSDC) model that generates homogeneous clusters of documents. We compare the quality of the clusters obtained by the proposed KSDC technique with those of the k-means and neural gas algorithms, which are more oriented towards these metrics, on several real-world textual datasets.
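
The training and out-of-sample steps described in the abstract can be illustrated with a minimal sketch. It assumes the commonly cited KSC dual eigenproblem D^{-1} M_D Ω α = λ α with sign-based cluster encoding, which the abstract itself does not spell out. The RBF kernel, the bandwidth `sigma`, and the names `rbf_kernel`, `ksc_train` and `ksc_predict` are illustrative choices, not the authors' implementation. Model selection with the unsupervised Precision, Recall and F-measure of [1] is not shown.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Gaussian kernel between the rows of A and the rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def ksc_train(X, k, sigma):
    # Assumed dual problem: D^{-1} M_D Omega alpha = lambda alpha.
    n = X.shape[0]
    omega = rbf_kernel(X, X, sigma)
    d_inv = 1.0 / omega.sum(axis=1)  # diagonal of D^{-1}
    # Weighted centering matrix M_D = I - 1 1^T D^{-1} / (1^T D^{-1} 1).
    m_d = np.eye(n) - np.outer(np.ones(n), d_inv) / d_inv.sum()
    vals, vecs = np.linalg.eig(d_inv[:, None] * (m_d @ omega))
    top = np.argsort(-vals.real)[: k - 1]  # k-1 leading eigenvectors
    alpha = vecs[:, top].real
    bias = -(d_inv @ omega @ alpha) / d_inv.sum()
    # Codebook: the k most frequent sign patterns of the training projections.
    signs = np.sign(omega @ alpha + bias)
    patterns, counts = np.unique(signs, axis=0, return_counts=True)
    codebook = patterns[np.argsort(-counts)[:k]]
    return {"X": X, "sigma": sigma, "alpha": alpha,
            "bias": bias, "codebook": codebook}

def ksc_predict(model, X_new):
    # Out-of-sample extension: project unseen points with the dual solution,
    # then assign each one to the codeword at the smallest Hamming distance.
    e = rbf_kernel(X_new, model["X"], model["sigma"]) @ model["alpha"]
    e = e + model["bias"]
    hamming = (np.sign(e)[:, None, :] != model["codebook"][None, :, :]).sum(axis=2)
    return hamming.argmin(axis=1)

# Toy data: two well-separated blobs stand in for document vectors.
rng = np.random.default_rng(0)
train = np.vstack([rng.normal(0.0, 0.3, (30, 2)),
                   rng.normal(5.0, 0.3, (30, 2))])
model = ksc_train(train, k=2, sigma=1.0)
labels = ksc_predict(model, np.array([[0.1, -0.1], [4.9, 5.2]]))
```

In a model selection loop, one would presumably repeat the training step over a range of candidate values of `k` (and kernel parameters) and keep the configuration with the best cluster-quality score on held-out points.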

Original language: English
Title of host publication: 2015 International Joint Conference on Neural Networks, IJCNN 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Volume: 2015-September
ISBN (Electronic): 9781479919604
DOIs: https://doi.org/10.1109/IJCNN.2015.7280654
Publication status: Published - 28 Sep 2015
Externally published: Yes
Event: International Joint Conference on Neural Networks, IJCNN 2015 - Killarney, Ireland
Duration: 12 Jul 2015 to 17 Jul 2015

Other

Other: International Joint Conference on Neural Networks, IJCNN 2015
Country: Ireland
City: Killarney
Period: 12/7/15 to 17/7/15

Keywords

  • Decoding
  • Measurement

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Cite this

Mall, R., & Suykens, J. A. K. (2015). Kernel spectral document clustering using unsupervised precision-recall metrics. In 2015 International Joint Conference on Neural Networks, IJCNN 2015 (Vol. 2015-September). [7280654] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IJCNN.2015.7280654