Large-scale learning with structural kernels for class-imbalanced datasets

Aliaksei Severyn, Alessandro Moschitti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Much of the success in machine learning can be attributed to the ability of learning methods to adequately represent, extract, and exploit the inherent structure present in the data of interest. Kernel methods are a rich family of techniques that build on this principle. Domain-specific kernels are able to exploit rich structural information present in the input data to deliver state-of-the-art results in many application areas, e.g. natural language processing (NLP), bioinformatics, and computer vision. The use of kernels to capture relationships in the input data has made the Support Vector Machine (SVM) algorithm the state-of-the-art tool in many of these areas. Nevertheless, kernel learning remains a computationally expensive process. The contribution of this paper is to make learning with structural kernels, e.g. tree kernels, more applicable to real-world large-scale tasks. More specifically, we propose two important enhancements of the approximate cutting plane algorithm for training Support Vector Machines with structural kernels: (i) a new sampling strategy to handle the class-imbalance problem; and (ii) a parallel implementation, which makes training scale almost linearly with the number of CPUs. We also show that the theoretical convergence bounds are preserved for the improved algorithm. The experimental evaluations demonstrate the soundness of our approach and the feasibility of large-scale learning with structural kernels.
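The abstract does not describe the proposed sampling strategy in detail. As a purely illustrative sketch of what sampling under class imbalance can look like, the Python snippet below draws the same number of examples from each class, so that a rare class is not swamped by a frequent one. The function name `balanced_sample` and its parameters are hypothetical and are not taken from the paper; this is generic class-balanced sampling, not the authors' method.

```python
import random
from collections import defaultdict


def balanced_sample(labels, per_class, seed=0):
    """Return indices with up to `per_class` examples drawn from each class.

    Hypothetical helper for illustration only: generic class-balanced
    sampling, not the strategy proposed in the paper.
    """
    rng = random.Random(seed)

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    # Draw without replacement from each class, capped by the class size.
    sample = []
    for y in sorted(by_class):
        members = by_class[y]
        k = min(per_class, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Toy imbalanced dataset: 5 positive and 95 negative examples.
labels = [+1] * 5 + [-1] * 95
idx = balanced_sample(labels, per_class=5)
```

On the toy data, the sample holds five indices from each class even though negatives outnumber positives 19 to 1 in the full set.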

Original language: English
Title of host publication: Eternal Systems - First International Workshop, EternalS 2011, Revised Selected Papers
Pages: 34-41
Number of pages: 8
DOIs: https://doi.org/10.1007/978-3-642-28033-7_4
Publication status: Published - 27 Aug 2012
Event: 1st International Workshop on Eternal Systems, EternalS 2011 - Budapest, Hungary
Duration: 3 May 2011 to 3 May 2011

Publication series

Name: Communications in Computer and Information Science
Volume: 255 CCIS
ISSN (Print): 1865-0929

Other

Other: 1st International Workshop on Eternal Systems, EternalS 2011
Country: Hungary
City: Budapest
Period: 3/5/11 to 3/5/11

Keywords

  • Kernel Methods
  • Machine Learning
  • Natural Language Processing
  • Structural Kernels
  • Support Vector Machine

ASJC Scopus subject areas

  • Computer Science (all)
  • Mathematics (all)

Cite this

Severyn, A., & Moschitti, A. (2012). Large-scale learning with structural kernels for class-imbalanced datasets. In Eternal Systems - First International Workshop, EternalS 2011, Revised Selected Papers (pp. 34-41). (Communications in Computer and Information Science; Vol. 255 CCIS). https://doi.org/10.1007/978-3-642-28033-7_4