On the use of an intermediate class in boolean crowdsourced relevance annotations for learning to rank comments

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

In many Information Retrieval tasks the boundary between classes is not well defined and assigning a document to a specific class may be complicated, even for humans. For instance, a document which is not directly related to the user's query may still contain relevant information. In this scenario, an option is to define an intermediate class collecting ambiguous instances. Yet some natural questions arise. Is this annotation strategy convenient? How should the intermediate class be treated? To answer these questions, we explored two community question answering datasets whose comments were originally annotated with three classes and re-annotated a subset of instances considering a binary good vs bad setting. Our main contribution is to show empirically that the inclusion of an intermediate class to assess Boolean relevance is not useful. Moreover, in case the data is already annotated with a 3-class strategy, the instances from the intermediate class can be safely removed at training time.
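The closing recommendation of the abstract, dropping intermediate-class instances before training a binary model, can be illustrated with a minimal sketch. The label strings `good`, `intermediate`, and `bad` and the helper `drop_intermediate` are hypothetical names chosen for illustration; the abstract does not name the three classes or describe the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical label names (assumption): the abstract only says "three classes".
GOOD = "good"
INTERMEDIATE = "intermediate"
BAD = "bad"


def drop_intermediate(examples: List[Tuple[str, str]]) -> List[Tuple[str, bool]]:
    """Convert 3-class (comment, label) pairs into binary (comment, is_good) pairs.

    Instances carrying the intermediate label are discarded rather than
    being merged into either the good or the bad class.
    """
    return [
        (comment, label == GOOD)
        for comment, label in examples
        if label != INTERMEDIATE
    ]


# Toy data, invented for this example only.
train = [
    ("comment 1", GOOD),
    ("comment 2", INTERMEDIATE),
    ("comment 3", BAD),
]
binary_train = drop_intermediate(train)
```

The resulting `binary_train` list keeps only the good and bad comments and could then be passed to any binary classifier or ranker.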

Original language: English
Title of host publication: SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 1209-1212
Number of pages: 4
ISBN (Electronic): 9781450350228
DOI: https://doi.org/10.1145/3077136.3080763
Publication status: Published - 7 Aug 2017
Event: 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2017 - Tokyo, Shinjuku, Japan
Duration: 7 Aug 2017 to 11 Aug 2017

Other

Other: 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2017
Country: Japan
City: Tokyo, Shinjuku
Period: 7/8/17 to 11/8/17

Fingerprint

Information retrieval

Keywords

  • Community question answering
  • Crowdsourcing
  • Learning to rank
  • Relevance assessment

ASJC Scopus subject areas

  • Information Systems
  • Software
  • Computer Graphics and Computer-Aided Design

Cite this

Barron, A., Martino, G., Filice, S., & Moschitti, A. (2017). On the use of an intermediate class in boolean crowdsourced relevance annotations for learning to rank comments. In SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1209-1212). Association for Computing Machinery, Inc. https://doi.org/10.1145/3077136.3080763

@inproceedings{13d40ea2b45e4f2f81f5f11a0daee8c4,
title = "On the use of an intermediate class in boolean crowdsourced relevance annotations for learning to rank comments",
abstract = "In many Information Retrieval tasks the boundary between classes is not well defined and assigning a document to a specific class may be complicated, even for humans. For instance, a document which is not directly related to the user's query may still contain relevant information. In this scenario, an option is to define an intermediate class collecting ambiguous instances. Yet some natural questions arise. Is this annotation strategy convenient? How should the intermediate class be treated? To answer these questions, we explored two community question answering datasets whose comments were originally annotated with three classes and re-annotated a subset of instances considering a binary good vs bad setting. Our main contribution is to show empirically that the inclusion of an intermediate class to assess Boolean relevance is not useful. Moreover, in case the data is already annotated with a 3-class strategy, the instances from the intermediate class can be safely removed at training time.",
keywords = "Community question answering, Crowdsourcing, Learning to rank, Relevance assessment",
author = "Alberto Barron and Giovanni Martino and Simone Filice and Alessandro Moschitti",
year = "2017",
month = "8",
day = "7",
doi = "10.1145/3077136.3080763",
language = "English",
pages = "1209--1212",
booktitle = "SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval",
publisher = "Association for Computing Machinery, Inc",

}

TY - GEN

T1 - On the use of an intermediate class in boolean crowdsourced relevance annotations for learning to rank comments

AU - Barron, Alberto

AU - Martino, Giovanni

AU - Filice, Simone

AU - Moschitti, Alessandro

PY - 2017/8/7

Y1 - 2017/8/7

N2 - In many Information Retrieval tasks the boundary between classes is not well defined and assigning a document to a specific class may be complicated, even for humans. For instance, a document which is not directly related to the user's query may still contain relevant information. In this scenario, an option is to define an intermediate class collecting ambiguous instances. Yet some natural questions arise. Is this annotation strategy convenient? How should the intermediate class be treated? To answer these questions, we explored two community question answering datasets whose comments were originally annotated with three classes and re-annotated a subset of instances considering a binary good vs bad setting. Our main contribution is to show empirically that the inclusion of an intermediate class to assess Boolean relevance is not useful. Moreover, in case the data is already annotated with a 3-class strategy, the instances from the intermediate class can be safely removed at training time.

AB - In many Information Retrieval tasks the boundary between classes is not well defined and assigning a document to a specific class may be complicated, even for humans. For instance, a document which is not directly related to the user's query may still contain relevant information. In this scenario, an option is to define an intermediate class collecting ambiguous instances. Yet some natural questions arise. Is this annotation strategy convenient? How should the intermediate class be treated? To answer these questions, we explored two community question answering datasets whose comments were originally annotated with three classes and re-annotated a subset of instances considering a binary good vs bad setting. Our main contribution is to show empirically that the inclusion of an intermediate class to assess Boolean relevance is not useful. Moreover, in case the data is already annotated with a 3-class strategy, the instances from the intermediate class can be safely removed at training time.

KW - Community question answering

KW - Crowdsourcing

KW - Learning to rank

KW - Relevance assessment

UR - http://www.scopus.com/inward/record.url?scp=85029364749&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85029364749&partnerID=8YFLogxK

U2 - 10.1145/3077136.3080763

DO - 10.1145/3077136.3080763

M3 - Conference contribution

SP - 1209

EP - 1212

BT - SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval

PB - Association for Computing Machinery, Inc

ER -