Combined syntactic and semantic kernels for text classification

Stephan Bloehdorn, Alessandro Moschitti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

22 Citations (Scopus)

Abstract

The exploitation of syntactic structures and semantic background knowledge has always been an appealing subject in the context of text retrieval and information management. The usefulness of this kind of information has been shown most prominently in highly specialized tasks, such as classification in Question Answering (QA) scenarios. So far, however, syntactic and semantic information have each been used only in isolation. In this paper, we propose a principled approach for jointly exploiting both types of information. We introduce a new type of kernel, the Semantic Syntactic Tree Kernel (SSTK), which incorporates linguistic structures, e.g., syntactic dependencies, and semantic background knowledge, e.g., term similarity based on WordNet, to automatically learn question categories in QA. We show the power of this approach in a series of experiments with a well-known Question Classification dataset.
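The abstract does not give the kernel's definition, so the following is only a minimal sketch of the general idea, not the authors' formulation: a subset-tree kernel in the style of Collins and Duffy, in which the exact word match at the leaves is replaced by a term-similarity score. The `Node` class, the decay parameter `lam`, and the `toy_sim` table (standing in for a WordNet-based similarity) are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

TermSim = Callable[[str, str], float]


@dataclass(frozen=True)
class Node:
    """A parse-tree node; leaves (words) have no children."""
    label: str
    children: Tuple["Node", ...] = ()

    def is_leaf(self) -> bool:
        return not self.children

    def is_preterminal(self) -> bool:
        # A pre-terminal dominates exactly one word, e.g. (NN car).
        return len(self.children) == 1 and self.children[0].is_leaf()


def internal_nodes(tree: Node) -> List[Node]:
    """Collect every node that has at least one child."""
    found, stack = [], [tree]
    while stack:
        node = stack.pop()
        if node.children:
            found.append(node)
            stack.extend(node.children)
    return found


def delta(n1: Node, n2: Node, term_sim: TermSim, lam: float) -> float:
    """Weighted count of matching fragments rooted at n1 and n2."""
    if n1.is_leaf() or n2.is_leaf():
        return 0.0
    if n1.is_preterminal() and n2.is_preterminal():
        if n1.label != n2.label:
            return 0.0
        # Assumption: words are compared softly instead of exactly.
        return lam * term_sim(n1.children[0].label, n2.children[0].label)
    if n1.is_preterminal() or n2.is_preterminal():
        return 0.0
    # Both nodes must expand with the same production (label and child labels).
    if n1.label != n2.label or len(n1.children) != len(n2.children):
        return 0.0
    if any(c1.label != c2.label for c1, c2 in zip(n1.children, n2.children)):
        return 0.0
    score = lam
    for c1, c2 in zip(n1.children, n2.children):
        score *= 1.0 + delta(c1, c2, term_sim, lam)
    return score


def tree_kernel(t1: Node, t2: Node, term_sim: TermSim, lam: float = 1.0) -> float:
    """Sum delta over all pairs of internal nodes of the two trees."""
    return sum(
        delta(a, b, term_sim, lam)
        for a in internal_nodes(t1)
        for b in internal_nodes(t2)
    )


def exact_sim(a: str, b: str) -> float:
    """Plain tree kernel behaviour: words must be identical."""
    return 1.0 if a == b else 0.0


# Hypothetical similarity table standing in for a WordNet-based measure.
TOY_SIM = {frozenset(("car", "automobile")): 0.8}


def toy_sim(a: str, b: str) -> float:
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)
```

With `exact_sim` the function reduces to an ordinary subset-tree kernel; with `toy_sim`, two phrases that differ only in related words (here "car" and "automobile") receive a higher score than they would under exact matching.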

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 307-318
Number of pages: 12
Volume: 4425 LNCS
Publication status: Published - 20 Dec 2007
Externally published: Yes
Event: 29th European Conference on IR Research, ECIR 2007 - Rome, Italy
Duration: 2 Apr 2007 to 5 Apr 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4425 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Fingerprint

Text Classification
Syntactics
Semantics
kernel
Question Answering
Text Retrieval
Information Management
WordNet
Linguistics
Exploitation
Scenarios
Series
Syntax
Term
Experiments
Background
Knowledge

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Bloehdorn, S., & Moschitti, A. (2007). Combined syntactic and semantic kernels for text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4425 LNCS, pp. 307-318).

@inproceedings{e0b4cce4cf1c44dba439db2b24281a25,
title = "Combined syntactic and semantic kernels for text classification",
author = "Stephan Bloehdorn and Alessandro Moschitti",
year = "2007",
month = "12",
day = "20",
language = "English",
isbn = "3540714944",
volume = "4425 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "307--318",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY  - GEN
T1  - Combined syntactic and semantic kernels for text classification
AU  - Bloehdorn, Stephan
AU  - Moschitti, Alessandro
PY  - 2007/12/20
Y1  - 2007/12/20
UR  - http://www.scopus.com/inward/record.url?scp=37149055744&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=37149055744&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:37149055744
SN  - 3540714944
SN  - 9783540714941
VL  - 4425 LNCS
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 307
EP  - 318
BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER  -