Semantic tree kernels for statistical natural language learning

Danilo Croce, Roberto Basili, Alessandro Moschitti

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

A central topic in Natural Language Processing (NLP) is the design of effective linguistic processors suitable for the target applications. Within this scenario, Convolution Kernels provide a powerful method to directly apply Machine Learning algorithms to complex structures representing linguistic information. The main topic of this work is the definition of the semantically Smoothed Partial Tree Kernel (SPTK), a generalized formulation of one of the most performant Convolution Kernels, i.e. the Tree Kernel (TK), by extending the similarity between tree structures with node similarities. The main characteristic of SPTK is its ability to measure the similarity between syntactic tree structures, which are partially similar and whose nodes can differ but are nevertheless semantically related. One of the most important outcomes is that SPTK allows for embedding external lexical information in the kernel function only through a similarity function among lexical nodes. The SPTK has been evaluated in three complex automatic Semantic Processing tasks: Question Classification in Question Answering, Verb Classification and Semantic Role Labeling. Although these tasks address different problems, state-of-the-art results have been achieved in every evaluation.
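
The idea of replacing exact label matching with a node similarity can be illustrated with a small sketch. This is not the SPTK formulation from the paper: it is a deliberately simplified recursive tree kernel that only aligns children position by position, and the names `Node`, `hard_sim`, `make_soft_sim`, `delta`, `tree_kernel`, the decay `lam`, and the 0.8 similarity between "buy" and "purchase" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Node:
    """A labeled tree node (hypothetical structure for this sketch)."""
    label: str
    children: list = field(default_factory=list)

    def nodes(self):
        """Yield this node and all of its descendants."""
        yield self
        for child in self.children:
            yield from child.nodes()


def hard_sim(a, b):
    """Exact label match, as in an unsmoothed tree kernel."""
    return 1.0 if a == b else 0.0


def make_soft_sim(related):
    """Build a symmetric label similarity from a table of related label pairs."""
    def sim(a, b):
        if a == b:
            return 1.0
        return related.get(frozenset((a, b)), 0.0)
    return sim


def delta(n1, n2, sim, lam=0.4):
    """Similarity of the structures rooted at n1 and n2 (simplified recursion)."""
    s = sim(n1.label, n2.label)
    if s == 0.0:
        return 0.0
    # Stop at leaves, or when the children cannot be aligned one to one.
    if not n1.children or len(n1.children) != len(n2.children):
        return lam * s
    score = lam * s
    for c1, c2 in zip(n1.children, n2.children):
        score *= 1.0 + delta(c1, c2, sim, lam)
    return score


def tree_kernel(t1, t2, sim, lam=0.4):
    """Sum the rooted similarities over all pairs of nodes."""
    return sum(delta(a, b, sim, lam) for a, b in product(t1.nodes(), t2.nodes()))


# Two toy parse trees that differ only in one lexical node.
t1 = Node("VP", [Node("V", [Node("buy")]), Node("NP", [Node("N", [Node("car")])])])
t2 = Node("VP", [Node("V", [Node("purchase")]), Node("NP", [Node("N", [Node("car")])])])

soft = make_soft_sim({frozenset(("buy", "purchase")): 0.8})
print(tree_kernel(t1, t2, hard_sim), tree_kernel(t1, t2, soft))
```

With exact matching, the pair "buy"/"purchase" contributes nothing; with the soft similarity it contributes a partial score that also propagates to the parent nodes, so the two trees come out as more similar. The external lexical knowledge enters only through the `sim` function, which mirrors the property the abstract highlights.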

Original language: English
Pages (from-to): 93-113
Number of pages: 21
Journal: Studies in Computational Intelligence
Volume: 589
DOI: 10.1007/978-3-319-14206-7_5
Publication status: Published - 2015
Externally published: Yes

Keywords

  • Kernel methods
  • Semantic role labeling
  • Tree kernels
  • Verb classification

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Semantic tree kernels for statistical natural language learning. / Croce, Danilo; Basili, Roberto; Moschitti, Alessandro.

In: Studies in Computational Intelligence, Vol. 589, 2015, p. 93-113.

@article{b0110ea07a4a43cc8fdcf5d3bbfde2c0,
title = "Semantic tree kernels for statistical natural language learning",
abstract = "A central topic in Natural Language Processing (NLP) is the design of effective linguistic processors suitable for the target applications. Within this scenario, Convolution Kernels provide a powerful method to directly apply Machine Learning algorithms to complex structures representing linguistic information. The main topic of this work is the definition of the semantically Smoothed Partial Tree Kernel (SPTK), a generalized formulation of one of the most performant Convolution Kernels, i.e. the Tree Kernel (TK), by extending the similarity between tree structures with node similarities. The main characteristic of SPTK is its ability to measure the similarity between syntactic tree structures, which are partially similar and whose nodes can differ but are nevertheless semantically related. One of the most important outcomes is that SPTK allows for embedding external lexical information in the kernel function only through a similarity function among lexical nodes. The SPTK has been evaluated in three complex automatic Semantic Processing tasks: Question Classification in Question Answering, Verb Classification and Semantic Role Labeling. Although these tasks address different problems, state-of-the-art results have been achieved in every evaluation.",
keywords = "Kernel methods, Semantic role labeling, Tree kernels, Verb classification",
author = "Danilo Croce and Roberto Basili and Alessandro Moschitti",
year = "2015",
doi = "10.1007/978-3-319-14206-7_5",
language = "English",
volume = "589",
pages = "93--113",
journal = "Studies in Computational Intelligence",
issn = "1860-949X",
publisher = "Springer Verlag",
}

TY - JOUR

T1 - Semantic tree kernels for statistical natural language learning

AU - Croce, Danilo

AU - Basili, Roberto

AU - Moschitti, Alessandro

PY - 2015

Y1 - 2015

AB - A central topic in Natural Language Processing (NLP) is the design of effective linguistic processors suitable for the target applications. Within this scenario, Convolution Kernels provide a powerful method to directly apply Machine Learning algorithms to complex structures representing linguistic information. The main topic of this work is the definition of the semantically Smoothed Partial Tree Kernel (SPTK), a generalized formulation of one of the most performant Convolution Kernels, i.e. the Tree Kernel (TK), by extending the similarity between tree structures with node similarities. The main characteristic of SPTK is its ability to measure the similarity between syntactic tree structures, which are partially similar and whose nodes can differ but are nevertheless semantically related. One of the most important outcomes is that SPTK allows for embedding external lexical information in the kernel function only through a similarity function among lexical nodes. The SPTK has been evaluated in three complex automatic Semantic Processing tasks: Question Classification in Question Answering, Verb Classification and Semantic Role Labeling. Although these tasks address different problems, state-of-the-art results have been achieved in every evaluation.

KW - Kernel methods

KW - Semantic role labeling

KW - Tree kernels

KW - Verb classification

UR - http://www.scopus.com/inward/record.url?scp=84926654992&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84926654992&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-14206-7_5

DO - 10.1007/978-3-319-14206-7_5

M3 - Article

VL - 589

SP - 93

EP - 113

JO - Studies in Computational Intelligence

JF - Studies in Computational Intelligence

SN - 1860-949X

ER -