Kernel-based learning to rank with syntactic and semantic structures

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

In recent years, machine learning (ML) has been increasingly used to solve complex tasks in different disciplines, ranging from Data Mining to Information Retrieval (IR) and Natural Language Processing (NLP). These tasks often require the processing of structured input. For example, NLP applications critically deal with syntactic and semantic structures. Modeling the latter in terms of feature vectors for ML algorithms requires considerable expertise, intuition and deep knowledge about the target linguistic phenomenon. Kernel Methods (KMs) are powerful ML techniques (see e.g., [5]) that can alleviate the data representation problem, as they substitute the scalar product between feature vectors with similarity functions (kernels) defined directly between training/test instances, e.g., syntactic trees (thus explicit features are no longer needed). Additionally, kernel engineering, i.e., the composition or adaptation of several prototype kernels, facilitates the design of the similarity functions required for new tasks, e.g., [1, 2]. KMs can be very valuable for IR research: for example, they allow us to easily exploit syntactic/semantic structures, e.g., dependency, constituency or shallow semantic structures, in learning to rank algorithms [3, 4]. In general, KMs can make it easier to use NLP techniques in IR tasks.

This tutorial aims at introducing the essential and simplified theory of Support Vector Machines (SVMs) and KMs for the design of practical applications. It describes effective kernels for easily engineering automatic classifiers and learning to rank algorithms, also using structured data and semantic processing. Some examples are drawn from well-known tasks, i.e., Question Answering and Passage Reranking, Short and Long Text Categorization, Relation Extraction, Named Entity Recognition and Coreference Resolution. Moreover, some practical demonstrations are given with the SVM-Light-TK (tree kernel) toolkit. In more detail, best practices for successfully using KMs for IR and NLP are presented according to the following outline:
(i) A very brief introduction to SVMs (explained from an application viewpoint) and KM theory (the essential content for understanding practical procedures).
(ii) Presentation of kernel engineering building blocks, such as linear, polynomial, lexical, sequence and tree kernels, focusing on their function, accuracy and efficiency rather than their mathematical characterization, so that they can be easily understood.
(iii) Illustration of important applications for which kernels achieve the state of the art, i.e., Question Classification, Question and Answer (passage) Reranking, Relation Extraction, Coreference Resolution and Hierarchical Text Categorization. In this perspective, kernels for reranking are presented as an efficient and effective approach to learning dependencies between structured input and output.
(iv) A practical exercise on the quick design of ML systems using the SVM-Light-TK toolkit, which encodes several kernels in SVMs.
(v) A summary of the key points for engineering innovative and effective kernels, starting from basic kernels and using systematic data transformations.
(vi) Presentation of the latest KM findings: kernel-based learning at large scale with fast SVMs, generalized structural and semantic kernels, and reverse kernel engineering.
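
As a concrete illustration of the kernel idea summarized above (not part of the original tutorial material), the sketch below defines two simple similarity functions over tokenized sentences, composes them into a new kernel, and trains an SVM directly on the resulting Gram matrix, so that no explicit feature vectors are ever engineered. It uses scikit-learn rather than SVM-Light-TK, and the kernel definitions and toy question-classification data are hypothetical examples for this note only.

    # A minimal sketch of the kernel idea: similarities are computed directly
    # between structured instances (here, token sequences); no feature vectors.
    # Assumes scikit-learn and NumPy; kernels and data are illustrative only.
    import numpy as np
    from sklearn.svm import SVC

    def lexical_kernel(a, b):
        # Word-overlap count between two token lists (a simple lexical kernel).
        return float(len(set(a) & set(b)))

    def sequence_kernel(a, b):
        # Shared-bigram count between two token lists (a simple sequence kernel).
        bigrams = lambda toks: set(zip(toks, toks[1:]))
        return float(len(bigrams(a) & bigrams(b)))

    def combined_kernel(a, b, alpha=1.0, beta=0.5):
        # Kernel engineering by composition: a weighted sum of kernels is still a kernel.
        return alpha * lexical_kernel(a, b) + beta * sequence_kernel(a, b)

    def gram_matrix(X, Z):
        # K[i, j] = k(X[i], Z[j]) between two sets of structured instances.
        return np.array([[combined_kernel(x, z) for z in Z] for x in X])

    # Toy question-classification data (hypothetical): 0 = LOCATION, 1 = HUMAN.
    train = ["what city hosts the olympics".split(),
             "who wrote the iliad".split(),
             "what country borders spain".split(),
             "who painted the night watch".split()]
    y_train = [0, 1, 0, 1]
    test = ["who composed the magic flute".split()]

    # With kernel='precomputed', the SVM only sees pairwise similarities.
    svm = SVC(kernel="precomputed")
    svm.fit(gram_matrix(train, train), y_train)
    print(svm.predict(gram_matrix(test, train)))  # e.g., [1], i.e., a HUMAN question

The same pattern extends to the tree and reranking kernels discussed in the tutorial: only the similarity function changes (e.g., counting shared tree fragments between parse trees), while the learning machinery stays untouched.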

Original language: English
Title of host publication: SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval
ISBN (Print): 9781450320344
DOIs: https://doi.org/10.1145/2484028.2484196
Publication status: Published - 2 Sep 2013
Event: 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013 - Dublin, Ireland
Duration: 28 Jul 2013 – 1 Aug 2013

Other

Other: 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013
Country: Ireland
City: Dublin
Period: 28/7/13 – 1/8/13

Keywords

  • Kernel Methods
  • Large-Scale Learning
  • Question Answering
  • Structural Kernels
  • Support Vector Machines

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Information Systems

Cite this

Moschitti, A. (2013). Kernel-based learning to rank with syntactic and semantic structures. In SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval https://doi.org/10.1145/2484028.2484196
