Do not trust the trolls

Predicting credibility in community question answering forums

Preslav Nakov, Tsvetomila Mihaylova, Lluis Marques, Yashkumar Shiroya, Ivan Koychev

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Citations (Scopus)

Abstract

We address information credibility in community forums, in a setting in which the credibility of an answer posted in a question thread by a particular user has to be predicted. First, we motivate the problem and we create a publicly available annotated English corpus by crowd-sourcing. Second, we propose a large set of features to predict the credibility of the answers. The features model the user, the answer, the question, the thread as a whole, and the interaction between them. Our experiments with ranking SVMs show that the credibility labels can be predicted with high performance according to several standard IR ranking metrics, thus supporting the potential usage of this layer of credibility information in practical applications. The features modeling the profile of the user (in particular trollness) turn out to be most important, but embedding features modeling the answer and the similarity between the question and the answer are also very relevant. Overall, half of the gap between the baseline performance and the perfect classifier can be covered using the proposed features.
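The abstract reports results on "several standard IR ranking metrics" without naming them. As a minimal sketch, assuming Mean Average Precision (MAP) is one such metric, the snippet below scores the answers of each thread as a ranked list of credible/not-credible labels. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def average_precision(ranked_labels: List[bool]) -> float:
    """Average precision for one thread.

    ranked_labels[i] is True when the answer placed at rank i + 1
    (best first) is annotated as credible.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_credible in enumerate(ranked_labels, start=1):
        if is_credible:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    # A thread with no credible answers contributes 0 by convention here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(threads: Dict[str, List[bool]]) -> float:
    """Mean of the per-thread average precision values."""
    if not threads:
        return 0.0
    total = sum(average_precision(labels) for labels in threads.values())
    return total / len(threads)


# Hypothetical toy data: two threads, answers already sorted by model score.
toy_threads = {
    "thread_a": [True, False, True],
    "thread_b": [False, True],
}
map_score = mean_average_precision(toy_threads)
print(f"MAP = {map_score:.4f}")
```

A ranking that places credible answers first yields a MAP of 1.0, which corresponds to the "perfect classifier" end of the gap mentioned in the abstract.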

Original language: English
Title of host publication: International Conference on Recent Advances in Natural Language Processing
Subtitle of host publication: Meet Deep Learning, RANLP 2017 - Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 551-560
Number of pages: 10
Volume: 2017-September
ISBN (Electronic): 9789544520489
DOI: https://doi.org/10.26615/978-954-452-049-6-072
Publication status: Published - 1 Jan 2017
Event: 11th International Conference on Recent Advances in Natural Language Processing, RANLP 2017 - Varna, Bulgaria
Duration: 2 Sep 2017 to 8 Sep 2017

Other

Other: 11th International Conference on Recent Advances in Natural Language Processing, RANLP 2017
Country: Bulgaria
City: Varna
Period: 2/9/17 to 8/9/17

Fingerprint

Labels
Classifiers
Experiments

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Software
  • Electrical and Electronic Engineering

Cite this

Nakov, P., Mihaylova, T., Marques, L., Shiroya, Y., & Koychev, I. (2017). Do not trust the trolls: Predicting credibility in community question answering forums. In International Conference on Recent Advances in Natural Language Processing: Meet Deep Learning, RANLP 2017 - Proceedings (Vol. 2017-September, pp. 551-560). Association for Computational Linguistics (ACL). https://doi.org/10.26615/978-954-452-049-6-072

@inproceedings{c23a44243bbc4775bde63ef84ebfbcd0,
title = "Do not trust the trolls: Predicting credibility in community question answering forums",
abstract = "We address information credibility in community forums, in a setting in which the credibility of an answer posted in a question thread by a particular user has to be predicted. First, we motivate the problem and we create a publicly available annotated English corpus by crowd-sourcing. Second, we propose a large set of features to predict the credibility of the answers. The features model the user, the answer, the question, the thread as a whole, and the interaction between them. Our experiments with ranking SVMs show that the credibility labels can be predicted with high performance according to several standard IR ranking metrics, thus supporting the potential usage of this layer of credibility information in practical applications. The features modeling the profile of the user (in particular trollness) turn out to be most important, but embedding features modeling the answer and the similarity between the question and the answer are also very relevant. Overall, half of the gap between the baseline performance and the perfect classifier can be covered using the proposed features.",
author = "Preslav Nakov and Tsvetomila Mihaylova and Lluis Marques and Yashkumar Shiroya and Ivan Koychev",
year = "2017",
month = "1",
day = "1",
doi = "10.26615/978-954-452-049-6-072",
language = "English",
volume = "2017-September",
pages = "551--560",
booktitle = "International Conference on Recent Advances in Natural Language Processing",
publisher = "Association for Computational Linguistics (ACL)",

}

TY - GEN
T1 - Do not trust the trolls
T2 - Predicting credibility in community question answering forums
AU - Nakov, Preslav
AU - Mihaylova, Tsvetomila
AU - Marques, Lluis
AU - Shiroya, Yashkumar
AU - Koychev, Ivan
PY - 2017/1/1
Y1 - 2017/1/1
AB - We address information credibility in community forums, in a setting in which the credibility of an answer posted in a question thread by a particular user has to be predicted. First, we motivate the problem and we create a publicly available annotated English corpus by crowd-sourcing. Second, we propose a large set of features to predict the credibility of the answers. The features model the user, the answer, the question, the thread as a whole, and the interaction between them. Our experiments with ranking SVMs show that the credibility labels can be predicted with high performance according to several standard IR ranking metrics, thus supporting the potential usage of this layer of credibility information in practical applications. The features modeling the profile of the user (in particular trollness) turn out to be most important, but embedding features modeling the answer and the similarity between the question and the answer are also very relevant. Overall, half of the gap between the baseline performance and the perfect classifier can be covered using the proposed features.
UR - http://www.scopus.com/inward/record.url?scp=85045725334&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85045725334&partnerID=8YFLogxK
U2 - 10.26615/978-954-452-049-6-072
DO - 10.26615/978-954-452-049-6-072
M3 - Conference contribution
VL - 2017-September
SP - 551
EP - 560
BT - International Conference on Recent Advances in Natural Language Processing
PB - Association for Computational Linguistics (ACL)
ER -