Large-scale goodness polarity lexicons for community question answering

Todor Mihaylov, Daniel Balchev, Yasen Kiprov, Ivan Koychev, Preslav Nakov

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

3 Citations (Scopus)

Abstract

We transfer a key idea from the field of sentiment analysis to a new domain: community question answering (cQA). The cQA task we are interested in is the following: given a question and a thread of comments, we want to re-rank the comments, so that the ones that are good answers to the question would be ranked higher than the bad ones. We notice that good vs. bad comments use specific vocabulary and that one can often predict the goodness/badness of a comment even ignoring the question, based on the comment contents only. This leads us to the idea to build a good/bad polarity lexicon as an analogy to the positive/negative sentiment polarity lexicons, commonly used in sentiment analysis. In particular, we use pointwise mutual information in order to build large-scale goodness polarity lexicons in a semi-supervised manner starting with a small number of initial seeds. The evaluation results show an improvement of 0.7 MAP points absolute over a very strong baseline, and state-of-the-art performance on SemEval-2016 Task 3.
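
As an illustration of the seed-based idea in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a Turney-style formulation in which a word's goodness score is PMI(word, good seeds) minus PMI(word, bad seeds), with co-occurrence counted per comment. The function name `build_goodness_lexicon`, the whitespace tokenization, the `smoothing` constant, and the toy seeds and comments are all hypothetical.

```python
import math
from collections import Counter


def build_goodness_lexicon(comments, good_seeds, bad_seeds, smoothing=0.01):
    """Score each non-seed word by PMI(word, good seeds) - PMI(word, bad seeds).

    Assumption: a word co-occurs with a seed set when the same comment
    contains at least one seed from that set.
    """
    good_seeds, bad_seeds = set(good_seeds), set(bad_seeds)
    with_good = Counter()  # comments containing the word and a good seed
    with_bad = Counter()   # comments containing the word and a bad seed
    n_good = n_bad = 0     # comments containing any good / any bad seed

    for text in comments:
        tokens = set(text.lower().split())  # naive whitespace tokenization
        has_good = bool(tokens & good_seeds)
        has_bad = bool(tokens & bad_seeds)
        n_good += has_good
        n_bad += has_bad
        for tok in tokens - good_seeds - bad_seeds:
            if has_good:
                with_good[tok] += 1
            if has_bad:
                with_bad[tok] += 1

    lexicon = {}
    for word in set(with_good) | set(with_bad):
        # The word's own frequency and the corpus size cancel out in the
        # difference of the two PMI terms, leaving this ratio of counts.
        num = (with_good[word] + smoothing) * (n_bad + smoothing)
        den = (with_bad[word] + smoothing) * (n_good + smoothing)
        lexicon[word] = math.log2(num / den)
    return lexicon


# Hypothetical toy data: "try" as a good seed, "thanks" as a bad seed.
toy_comments = [
    "try the visa office downtown",
    "you can try the embassy office",
    "thanks for the info",
    "thanks lol",
]
toy_lexicon = build_goodness_lexicon(toy_comments, {"try"}, {"thanks"})
for word, score in sorted(toy_lexicon.items(), key=lambda kv: -kv[1]):
    print(f"{word}\t{score:+.2f}")
```

Words that appear mostly alongside good seeds receive positive scores and words that appear mostly alongside bad seeds receive negative ones. In a re-ranking setup, such scores could be aggregated per comment and used as additional features.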

Original language: English
Title of host publication: SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 1185-1188
Number of pages: 4
ISBN (Electronic): 9781450350228
DOI: https://doi.org/10.1145/3077136.3080757
Publication status: Published - 7 Aug 2017
Event: 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2017 - Tokyo, Shinjuku, Japan
Duration: 7 Aug 2017 to 11 Aug 2017

Other

Other: 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2017
Country: Japan
City: Tokyo, Shinjuku
Period: 7/8/17 to 11/8/17

Keywords

  • Community Question Answering
  • Goodness polarity lexicons
  • Sentiment Analysis

ASJC Scopus subject areas

  • Information Systems
  • Software
  • Computer Graphics and Computer-Aided Design

Cite this

Mihaylov, T., Balchev, D., Kiprov, Y., Koychev, I., & Nakov, P. (2017). Large-scale goodness polarity lexicons for community question answering. In SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1185-1188). Association for Computing Machinery, Inc. https://doi.org/10.1145/3077136.3080757

@inproceedings{012d7743f5ad4ac3bc1b0900c2744432,
title = "Large-scale goodness polarity lexicons for community question answering",
abstract = "We transfer a key idea from the field of sentiment analysis to a new domain: community question answering (cQA). The cQA task we are interested in is the following: given a question and a thread of comments, we want to re-rank the comments, so that the ones that are good answers to the question would be ranked higher than the bad ones. We notice that good vs. bad comments use specific vocabulary and that one can often predict the goodness/badness of a comment even ignoring the question, based on the comment contents only. This leads us to the idea to build a good/bad polarity lexicon as an analogy to the positive/negative sentiment polarity lexicons, commonly used in sentiment analysis. In particular, we use pointwise mutual information in order to build large-scale goodness polarity lexicons in a semi-supervised manner starting with a small number of initial seeds. The evaluation results show an improvement of 0.7 MAP points absolute over a very strong baseline, and state-of-the-art performance on SemEval-2016 Task 3.",
keywords = "Community Question Answering, Goodness polarity lexicons, Sentiment Analysis",
author = "Todor Mihaylov and Daniel Balchev and Yasen Kiprov and Ivan Koychev and Preslav Nakov",
year = "2017",
month = "8",
day = "7",
doi = "10.1145/3077136.3080757",
language = "English",
pages = "1185--1188",
booktitle = "SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval",
publisher = "Association for Computing Machinery, Inc",

}

TY - GEN

T1 - Large-scale goodness polarity lexicons for community question answering

AU - Mihaylov, Todor

AU - Balchev, Daniel

AU - Kiprov, Yasen

AU - Koychev, Ivan

AU - Nakov, Preslav

PY - 2017/8/7

Y1 - 2017/8/7

N2 - We transfer a key idea from the field of sentiment analysis to a new domain: community question answering (cQA). The cQA task we are interested in is the following: given a question and a thread of comments, we want to re-rank the comments, so that the ones that are good answers to the question would be ranked higher than the bad ones. We notice that good vs. bad comments use specific vocabulary and that one can often predict the goodness/badness of a comment even ignoring the question, based on the comment contents only. This leads us to the idea to build a good/bad polarity lexicon as an analogy to the positive/negative sentiment polarity lexicons, commonly used in sentiment analysis. In particular, we use pointwise mutual information in order to build large-scale goodness polarity lexicons in a semi-supervised manner starting with a small number of initial seeds. The evaluation results show an improvement of 0.7 MAP points absolute over a very strong baseline, and state-of-the-art performance on SemEval-2016 Task 3.

KW - Community Question Answering

KW - Goodness polarity lexicons

KW - Sentiment Analysis

UR - http://www.scopus.com/inward/record.url?scp=85029362799&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85029362799&partnerID=8YFLogxK

U2 - 10.1145/3077136.3080757

DO - 10.1145/3077136.3080757

M3 - Conference contribution

SP - 1185

EP - 1188

BT - SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval

PB - Association for Computing Machinery, Inc

ER -