Identification of answer-seeking questions in Arabic microblogs

Maram Hasanain, Tamer Elsayed, Walid Magdy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Over the past years, Twitter has earned a growing reputation as a hub for communication and for advertising and tracking events. However, several recent studies have shown that users of Twitter (and of microblogging platforms in general) are increasingly posting microblogs containing questions that seek answers from their readers. To help answer or route those questions, the problem of question identification has been studied over English tweets; to our knowledge, no study has attempted it over Arabic (let alone dialectal Arabic) tweets. In this paper, we tackle the problem of identifying answer-seeking questions in different dialects over a large collection of Arabic tweets. Our approach has two stages. We first use a rule-based filter to extract tweets with interrogative questions. We then use a binary classifier (trained on a carefully developed set of features) to detect tweets with answer-seeking questions. To evaluate the classifier, we used a set of randomly sampled dialectal Arabic tweets that were labeled through crowdsourcing. As a first study of this problem for Arabic, our approach achieved relatively good performance, with 64% recall and 80% precision in identifying tweets with answer-seeking questions.
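The two-stage design described in the abstract can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the interrogative cue list, the names `has_interrogative`, `toy_classifier`, `find_answer_seeking`, and `precision_recall`, and especially the stand-in second stage, which is a hand-written rule rather than the trained feature-based classifier the paper reports. The abstract does not specify the actual rules or features.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Hypothetical interrogative cues (Modern Standard and dialectal Arabic).
# The rule set actually used in the paper is not given in the abstract.
QUESTION_MARKS = ("?", "\u061F")  # Latin and Arabic question marks
QUESTION_WORDS = {"هل", "ماذا", "كيف", "أين", "متى", "لماذا", "شو", "وين", "ليش", "ايش"}


def has_interrogative(tweet: str) -> bool:
    """Stage 1: rule-based filter that keeps tweets that look like questions."""
    if any(mark in tweet for mark in QUESTION_MARKS):
        return True
    tokens = re.findall(r"\w+", tweet)  # \w matches Arabic letters in Python 3
    return any(token in QUESTION_WORDS for token in tokens)


def toy_classifier(tweet: str) -> bool:
    """Stage 2 stand-in (assumption): NOT the paper's trained model.

    Treats a question as answer-seeking unless it carries a link,
    a crude proxy for promotional or rhetorical posts.
    """
    return "http" not in tweet


def find_answer_seeking(
    tweets: Iterable[str],
    classifier: Callable[[str], bool] = toy_classifier,
) -> List[str]:
    """Run stage 1, then apply the binary classifier to the survivors."""
    candidates = [t for t in tweets if has_interrogative(t)]
    return [t for t in candidates if classifier(t)]


def precision_recall(
    predicted: Iterable[str], relevant: Iterable[str]
) -> Tuple[float, float]:
    """Precision and recall of the predicted set against the labeled set."""
    predicted_set, relevant_set = set(predicted), set(relevant)
    true_positives = len(predicted_set & relevant_set)
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(relevant_set) if relevant_set else 0.0
    return precision, recall


if __name__ == "__main__":
    sample = [
        "كيف أصل إلى المطار؟",          # question with an Arabic question mark
        "صباح الخير",                   # greeting, no interrogative cue
        "هل تعلم؟ http://example.com",  # question form, but carries a link
        "وين احسن مطعم",                # dialectal question without punctuation
    ]
    for tweet in find_answer_seeking(sample):
        print(tweet)
```

In a real system the `classifier` argument would be replaced by a model trained on crowdsourced labels, and `precision_recall` would be computed against those labels to obtain figures comparable to the ones quoted in the abstract.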

Original language: English
Title of host publication: CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery, Inc
Pages: 1839-1842
Number of pages: 4
ISBN (Print): 9781450325981
DOI: https://doi.org/10.1145/2661829.2661959
Publication status: Published - 3 Nov 2014
Event: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014 - Shanghai, China
Duration: 3 Nov 2014 to 7 Nov 2014

Keywords

  • Arabic
  • Crowdsourcing
  • Question identification
  • Twitter

ASJC Scopus subject areas

  • Information Systems and Management
  • Computer Science Applications
  • Information Systems

Cite this

Hasanain, M., Elsayed, T., & Magdy, W. (2014). Identification of answer-seeking questions in Arabic microblogs. In CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management (pp. 1839-1842). Association for Computing Machinery, Inc. https://doi.org/10.1145/2661829.2661959

@inproceedings{6a94cdb964a74761a2a42266ac396505,
title = "Identification of answer-seeking questions in Arabic microblogs",
keywords = "Arabic, Crowdsourcing, Question identification, Twitter",
author = "Maram Hasanain and Tamer Elsayed and Walid Magdy",
year = "2014",
month = "11",
day = "3",
doi = "10.1145/2661829.2661959",
language = "English",
isbn = "9781450325981",
pages = "1839--1842",
booktitle = "CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management",
publisher = "Association for Computing Machinery, Inc",

}

TY  - GEN
T1  - Identification of answer-seeking questions in Arabic microblogs
AU  - Hasanain, Maram
AU  - Elsayed, Tamer
AU  - Magdy, Walid
PY  - 2014/11/3
Y1  - 2014/11/3
KW  - Arabic
KW  - Crowdsourcing
KW  - Question identification
KW  - Twitter
UR  - http://www.scopus.com/inward/record.url?scp=84937605996&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84937605996&partnerID=8YFLogxK
U2  - 10.1145/2661829.2661959
DO  - 10.1145/2661829.2661959
M3  - Conference contribution
AN  - SCOPUS:84937605996
SN  - 9781450325981
SP  - 1839
EP  - 1842
BT  - CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management
PB  - Association for Computing Machinery, Inc
ER  -