Large-scale learning with structural kernels for class-imbalanced datasets

Aliaksei Severyn, Alessandro Moschitti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Much of the success of machine learning can be attributed to the ability of learning methods to adequately represent, extract, and exploit the inherent structure present in the data of interest. Kernel methods represent a rich family of techniques that build on this principle. Domain-specific kernels are able to exploit rich structural information in the input data to deliver state-of-the-art results in many application areas, e.g., natural language processing (NLP), bioinformatics, and computer vision. The use of kernels to capture relationships in the input data has made the Support Vector Machine (SVM) algorithm the state-of-the-art tool in many application areas. Nevertheless, kernel learning remains a computationally expensive process. The contribution of this paper is to make learning with structural kernels, e.g., tree kernels, more applicable to real-world large-scale tasks. More specifically, we propose two important enhancements of the approximate cutting-plane algorithm for training Support Vector Machines with structural kernels: (i) a new sampling strategy to handle the class-imbalance problem; and (ii) a parallel implementation, which makes training scale almost linearly with the number of CPUs. We also show that the theoretical convergence bounds are preserved for the improved algorithm. The experimental evaluations demonstrate the soundness of our approach and the possibility of carrying out large-scale learning with structural kernels.
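The class-rebalancing sampling idea in enhancement (i) can be illustrated with a minimal sketch. This is not the paper's exact strategy (which operates inside the approximate cutting-plane iterations); it only shows the underlying idea of drawing samples so that a rare positive class is equally represented. The function `balanced_sample` and its parameters are illustrative assumptions, not an API from the paper.

```python
import numpy as np

def balanced_sample(X, y, n, seed=None):
    """Draw n examples with equal probability mass per class.

    Illustrative sketch of rebalanced sampling for class-imbalanced
    data: each class contributes roughly n / n_classes examples,
    sampled with replacement, regardless of its original frequency.
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    per_class = n // len(classes)
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=per_class, replace=True)
        for c in classes
    ])
    return X[idx], y[idx]

# Toy imbalanced dataset: 95 negatives, 5 positives.
X = np.arange(100, dtype=float).reshape(-1, 1)
y = np.array([0] * 95 + [1] * 5)

# A sample of 40 now contains 20 examples per class.
Xs, ys = balanced_sample(X, y, n=40, seed=0)
```

Feeding such rebalanced samples to a learner (instead of uniform samples, which would almost never contain positives) is one standard way to keep a minority class from being ignored during training.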

Original language: English
Title of host publication: Communications in Computer and Information Science
Pages: 34-41
Number of pages: 8
Volume: 255 CCIS
DOI: 10.1007/978-3-642-28033-7_4
Publication status: Published - 27 Aug 2012
Externally published: Yes
Event: 1st International Workshop on Eternal Systems, EternalS 2011 - Budapest, Hungary
Duration: 3 May 2011 - 3 May 2011

Publication series

Name: Communications in Computer and Information Science
Volume: 255 CCIS
ISSN (Print): 1865-0929

Other

Other: 1st International Workshop on Eternal Systems, EternalS 2011
Country: Hungary
City: Budapest
Period: 3/5/11 - 3/5/11

Fingerprint

Support vector machines
Bioinformatics
Computer vision
Program processors
Learning systems
Sampling
Processing

Keywords

  • Kernel Methods
  • Machine Learning
  • Natural Language Processing
  • Structural Kernels
  • Support Vector Machine

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Severyn, A., & Moschitti, A. (2012). Large-scale learning with structural kernels for class-imbalanced datasets. In Communications in Computer and Information Science (Vol. 255 CCIS, pp. 34-41). (Communications in Computer and Information Science; Vol. 255 CCIS). https://doi.org/10.1007/978-3-642-28033-7_4

@inproceedings{7c56a4a6d32a48e596d61f73060cbf05,
title = "Large-scale learning with structural kernels for class-imbalanced datasets",
abstract = "Much of the success of machine learning can be attributed to the ability of learning methods to adequately represent, extract, and exploit the inherent structure present in the data of interest. Kernel methods represent a rich family of techniques that build on this principle. Domain-specific kernels are able to exploit rich structural information in the input data to deliver state-of-the-art results in many application areas, e.g., natural language processing (NLP), bioinformatics, and computer vision. The use of kernels to capture relationships in the input data has made the Support Vector Machine (SVM) algorithm the state-of-the-art tool in many application areas. Nevertheless, kernel learning remains a computationally expensive process. The contribution of this paper is to make learning with structural kernels, e.g., tree kernels, more applicable to real-world large-scale tasks. More specifically, we propose two important enhancements of the approximate cutting-plane algorithm for training Support Vector Machines with structural kernels: (i) a new sampling strategy to handle the class-imbalance problem; and (ii) a parallel implementation, which makes training scale almost linearly with the number of CPUs. We also show that the theoretical convergence bounds are preserved for the improved algorithm. The experimental evaluations demonstrate the soundness of our approach and the possibility of carrying out large-scale learning with structural kernels.",
keywords = "Kernel Methods, Machine Learning, Natural Language Processing, Structural Kernels, Support Vector Machine",
author = "Aliaksei Severyn and Alessandro Moschitti",
year = "2012",
month = "8",
day = "27",
doi = "10.1007/978-3-642-28033-7_4",
language = "English",
isbn = "9783642280320",
volume = "255 CCIS",
series = "Communications in Computer and Information Science",
pages = "34--41",
booktitle = "Communications in Computer and Information Science",

}

TY - GEN

T1 - Large-scale learning with structural kernels for class-imbalanced datasets

AU - Severyn, Aliaksei

AU - Moschitti, Alessandro

PY - 2012/8/27

Y1 - 2012/8/27

N2 - Much of the success of machine learning can be attributed to the ability of learning methods to adequately represent, extract, and exploit the inherent structure present in the data of interest. Kernel methods represent a rich family of techniques that build on this principle. Domain-specific kernels are able to exploit rich structural information in the input data to deliver state-of-the-art results in many application areas, e.g., natural language processing (NLP), bioinformatics, and computer vision. The use of kernels to capture relationships in the input data has made the Support Vector Machine (SVM) algorithm the state-of-the-art tool in many application areas. Nevertheless, kernel learning remains a computationally expensive process. The contribution of this paper is to make learning with structural kernels, e.g., tree kernels, more applicable to real-world large-scale tasks. More specifically, we propose two important enhancements of the approximate cutting-plane algorithm for training Support Vector Machines with structural kernels: (i) a new sampling strategy to handle the class-imbalance problem; and (ii) a parallel implementation, which makes training scale almost linearly with the number of CPUs. We also show that the theoretical convergence bounds are preserved for the improved algorithm. The experimental evaluations demonstrate the soundness of our approach and the possibility of carrying out large-scale learning with structural kernels.

AB - Much of the success of machine learning can be attributed to the ability of learning methods to adequately represent, extract, and exploit the inherent structure present in the data of interest. Kernel methods represent a rich family of techniques that build on this principle. Domain-specific kernels are able to exploit rich structural information in the input data to deliver state-of-the-art results in many application areas, e.g., natural language processing (NLP), bioinformatics, and computer vision. The use of kernels to capture relationships in the input data has made the Support Vector Machine (SVM) algorithm the state-of-the-art tool in many application areas. Nevertheless, kernel learning remains a computationally expensive process. The contribution of this paper is to make learning with structural kernels, e.g., tree kernels, more applicable to real-world large-scale tasks. More specifically, we propose two important enhancements of the approximate cutting-plane algorithm for training Support Vector Machines with structural kernels: (i) a new sampling strategy to handle the class-imbalance problem; and (ii) a parallel implementation, which makes training scale almost linearly with the number of CPUs. We also show that the theoretical convergence bounds are preserved for the improved algorithm. The experimental evaluations demonstrate the soundness of our approach and the possibility of carrying out large-scale learning with structural kernels.

KW - Kernel Methods

KW - Machine Learning

KW - Natural Language Processing

KW - Structural Kernels

KW - Support Vector Machine

UR - http://www.scopus.com/inward/record.url?scp=84865211823&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84865211823&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-28033-7_4

DO - 10.1007/978-3-642-28033-7_4

M3 - Conference contribution

SN - 9783642280320

VL - 255 CCIS

T3 - Communications in Computer and Information Science

SP - 34

EP - 41

BT - Communications in Computer and Information Science

ER -