Large-scale support vector learning with structural kernels

Aliaksei Severyn, Alessandro Moschitti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Citations (Scopus)

Abstract

In this paper, we present an extensive study of the cutting-plane algorithm (CPA) applied to structural kernels for advanced text classification on large datasets. In particular, we carry out comprehensive experiments on two natural language tasks: predicate argument extraction and question answering. Our results show that (i) CPA applied to train a non-linear model with different tree kernels fully matches the accuracy of the conventional SVM algorithm while being ten times faster; (ii) by using smaller sampling sizes to approximate subgradients in CPA, we can trade off accuracy for speed, yet the optimal parameters and kernels found remain optimal for the exact SVM. These results open numerous research perspectives, e.g. in natural language processing, as they show that complex structural kernels can be used efficiently in real-world applications. For example, for the first time, we could carry out extensive tests of several tree kernels on millions of training instances. As a direct benefit, we could experiment with a variant of the partial tree kernel, which we also propose in this paper.
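The sampling idea in point (ii) can be illustrated with a minimal sketch. It is not the authors' implementation: the paper works with tree kernels, whereas this example assumes explicit feature vectors and a linear model, and the function names, toy data, and sample size are all hypothetical. It only shows how a subgradient of the average hinge loss can be estimated from a random subset of the training examples instead of the full set.

```python
import random

def hinge_subgradient(w, examples):
    """Subgradient of the average hinge loss max(0, 1 - y * <w, x>) over `examples`."""
    dim = len(w)
    grad = [0.0] * dim
    for x, y in examples:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        if margin < 1.0:  # margin violation: this example contributes -y * x
            for k in range(dim):
                grad[k] -= y * x[k]
    return [g / len(examples) for g in grad]

def sampled_subgradient(w, examples, sample_size, rng):
    """Estimate the subgradient from a uniform random sample drawn without replacement."""
    sample = rng.sample(examples, sample_size)
    return hinge_subgradient(w, sample)

rng = random.Random(0)

# Hypothetical toy data: 2-D points labelled by the sign of the first coordinate.
data = []
for _ in range(1000):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    data.append((x, 1 if x[0] >= 0 else -1))

w = [0.0, 0.0]
exact = hinge_subgradient(w, data)              # uses all 1000 examples
approx = sampled_subgradient(w, data, 100, rng)  # uses 100 sampled examples
```

A smaller `sample_size` makes each estimate cheaper but noisier, which is the accuracy-for-speed trade-off the abstract refers to. In the kernelized setting, the cost saved is in kernel evaluations rather than dot products.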

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 229-244
Number of pages: 16
Volume: 6323 LNAI
Edition: PART 3
DOIs: https://doi.org/10.1007/978-3-642-15939-8_15
Publication status: Published - 25 Oct 2010
Externally published: Yes
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2010 - Barcelona, Spain
Duration: 20 Sep 2010 to 24 Sep 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 3
Volume: 6323 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2010
Country: Spain
City: Barcelona
Period: 20/9/10 to 24/9/10

Keywords

  • Natural Language Processing
  • Structural Kernels
  • Support Vector Machines

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Severyn, A., & Moschitti, A. (2010). Large-scale support vector learning with structural kernels. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (PART 3 ed., Vol. 6323 LNAI, pp. 229-244). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6323 LNAI, No. PART 3). https://doi.org/10.1007/978-3-642-15939-8_15

@inproceedings{8561113c8982410683299f7d5d5b2b7d,
title = "Large-scale support vector learning with structural kernels",
keywords = "Natural Language Processing, Structural Kernels, Support Vector Machines",
author = "Aliaksei Severyn and Alessandro Moschitti",
year = "2010",
month = "10",
day = "25",
doi = "10.1007/978-3-642-15939-8_15",
language = "English",
isbn = "3642159389",
volume = "6323 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 3",
pages = "229--244",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
edition = "PART 3",

}

TY - GEN
T1 - Large-scale support vector learning with structural kernels
AU - Severyn, Aliaksei
AU - Moschitti, Alessandro
PY - 2010/10/25
Y1 - 2010/10/25
AB - In this paper, we present an extensive study of the cutting-plane algorithm (CPA) applied to structural kernels for advanced text classification on large datasets. In particular, we carry out a comprehensive experimentation on two interesting natural language tasks, e.g. predicate argument extraction and question answering. Our results show that (i) CPA applied to train a non-linear model with different tree kernels fully matches the accuracy of the conventional SVM algorithm while being ten times faster; (ii) by using smaller sampling sizes to approximate subgradients in CPA we can trade off accuracy for speed, yet the optimal parameters and kernels found remain optimal for the exact SVM. These results open numerous research perspectives, e.g. in natural language processing, as they show that complex structural kernels can be efficiently used in real-world applications. For example, for the first time, we could carry out extensive tests of several tree kernels on millions of training instances. As a direct benefit, we could experiment with a variant of the partial tree kernel, which we also propose in this paper.
KW - Natural Language Processing
KW - Structural Kernels
KW - Support Vector Machines
UR - http://www.scopus.com/inward/record.url?scp=77958058028&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77958058028&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-15939-8_15
DO - 10.1007/978-3-642-15939-8_15
M3 - Conference contribution
SN - 3642159389
SN - 9783642159381
VL - 6323 LNAI
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 229
EP - 244
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -