A multiple-instance learning approach to sentence selection for question ranking

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In example-based retrieval, a system is queried with a document in order to retrieve other similar or relevant documents. We address an instance of this problem: question retrieval in community Question Answering (cQA) forums. In this scenario, both the document collection and the queries are relatively short multi-sentence documents subject to noise and redundancy, which makes it harder for learning-to-rank algorithms to build upon the proper text representation. To exploit only the relevant fragments of the query and collection documents, we treat them as sequences of sentences, in a multiple-instance learning fashion. By automatically pre-selecting the best sentences for our tree-kernel-based learning model, we improve over using the full text on the dataset of the 2016 SemEval cQA challenge in terms of both accuracy and speed, reaching the state of the art.

Original language: English
Title of host publication: Advances in Information Retrieval - 39th European Conference on IR Research, ECIR 2017, Proceedings
Publisher: Springer Verlag
Pages: 437-449
Number of pages: 13
Volume: 10193 LNCS
ISBN (Print): 9783319566078
DOI: https://doi.org/10.1007/978-3-319-56608-5_34
Publication status: Published - 2017
Event: 39th European Conference on Information Retrieval, ECIR 2017 - Aberdeen, United Kingdom
Duration: 8 Apr 2017 → 13 Apr 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10193 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 39th European Conference on Information Retrieval, ECIR 2017
Country: United Kingdom
City: Aberdeen
Period: 8/4/17 → 13/4/17

Fingerprint

Redundancy
Ranking
Question Answering
Retrieval
Query
Fragment
kernel
Scenarios
Learning
Text
Community
Model

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Romeo, S., Martino, G., Barron, A., & Moschitti, A. (2017). A multiple-instance learning approach to sentence selection for question ranking. In Advances in Information Retrieval - 39th European Conference on IR Research, ECIR 2017, Proceedings (Vol. 10193 LNCS, pp. 437-449). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10193 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-56608-5_34

A multiple-instance learning approach to sentence selection for question ranking. / Romeo, Salvatore; Martino, Giovanni; Barron, Alberto; Moschitti, Alessandro.

Advances in Information Retrieval - 39th European Conference on IR Research, ECIR 2017, Proceedings. Vol. 10193 LNCS Springer Verlag, 2017. p. 437-449 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10193 LNCS).


Romeo, S, Martino, G, Barron, A & Moschitti, A 2017, A multiple-instance learning approach to sentence selection for question ranking. in Advances in Information Retrieval - 39th European Conference on IR Research, ECIR 2017, Proceedings. vol. 10193 LNCS, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10193 LNCS, Springer Verlag, pp. 437-449, 39th European Conference on Information Retrieval, ECIR 2017, Aberdeen, United Kingdom, 8/4/17. https://doi.org/10.1007/978-3-319-56608-5_34
Romeo S, Martino G, Barron A, Moschitti A. A multiple-instance learning approach to sentence selection for question ranking. In Advances in Information Retrieval - 39th European Conference on IR Research, ECIR 2017, Proceedings. Vol. 10193 LNCS. Springer Verlag. 2017. p. 437-449. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-319-56608-5_34
Romeo, Salvatore ; Martino, Giovanni ; Barron, Alberto ; Moschitti, Alessandro. / A multiple-instance learning approach to sentence selection for question ranking. Advances in Information Retrieval - 39th European Conference on IR Research, ECIR 2017, Proceedings. Vol. 10193 LNCS Springer Verlag, 2017. pp. 437-449 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{9d37120ef4594b75bc7e2ad3a06b36bb,
title = "A multiple-instance learning approach to sentence selection for question ranking",
abstract = "In example-based retrieval a system is queried with a document aiming to retrieve other similar or relevant documents. We address an instance of this problem: question retrieval in community Question Answering (cQA) forums. In this scenario, both the document collection and the queries are relatively short multi-sentence documents subject to noise and redundancy, which makes it harder for learning-to-rank algorithms to build upon the proper text representation. In order to only exploit the relevant fragments of the query and collection documents, we treat them as a sequence of sentences, in a multiple instance learning fashion. By automatically pre-selecting the best sentences for our tree-kernel-based learning model, we improve over using full text performance on the dataset of the 2016 SemEval cQA challenge in terms of accuracy and speed, reaching the state of the art.",
author = "Salvatore Romeo and Giovanni Martino and Alberto Barron and Alessandro Moschitti",
year = "2017",
doi = "10.1007/978-3-319-56608-5_34",
language = "English",
isbn = "9783319566078",
volume = "10193 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "437--449",
booktitle = "Advances in Information Retrieval - 39th European Conference on IR Research, ECIR 2017, Proceedings",

}

TY - GEN

T1 - A multiple-instance learning approach to sentence selection for question ranking

AU - Romeo, Salvatore

AU - Martino, Giovanni

AU - Barron, Alberto

AU - Moschitti, Alessandro

PY - 2017

Y1 - 2017

N2 - In example-based retrieval a system is queried with a document aiming to retrieve other similar or relevant documents. We address an instance of this problem: question retrieval in community Question Answering (cQA) forums. In this scenario, both the document collection and the queries are relatively short multi-sentence documents subject to noise and redundancy, which makes it harder for learning-to-rank algorithms to build upon the proper text representation. In order to only exploit the relevant fragments of the query and collection documents, we treat them as a sequence of sentences, in a multiple instance learning fashion. By automatically pre-selecting the best sentences for our tree-kernel-based learning model, we improve over using full text performance on the dataset of the 2016 SemEval cQA challenge in terms of accuracy and speed, reaching the state of the art.

AB - In example-based retrieval a system is queried with a document aiming to retrieve other similar or relevant documents. We address an instance of this problem: question retrieval in community Question Answering (cQA) forums. In this scenario, both the document collection and the queries are relatively short multi-sentence documents subject to noise and redundancy, which makes it harder for learning-to-rank algorithms to build upon the proper text representation. In order to only exploit the relevant fragments of the query and collection documents, we treat them as a sequence of sentences, in a multiple instance learning fashion. By automatically pre-selecting the best sentences for our tree-kernel-based learning model, we improve over using full text performance on the dataset of the 2016 SemEval cQA challenge in terms of accuracy and speed, reaching the state of the art.

UR - http://www.scopus.com/inward/record.url?scp=85018699240&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85018699240&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-56608-5_34

DO - 10.1007/978-3-319-56608-5_34

M3 - Conference contribution

SN - 9783319566078

VL - 10193 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 437

EP - 449

BT - Advances in Information Retrieval - 39th European Conference on IR Research, ECIR 2017, Proceedings

PB - Springer Verlag

ER -