Improving tweet timeline generation by predicting optimal retrieval depth

Maram Hasanain, Tamer Elsayed, Walid Magdy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Tweet Timeline Generation (TTG) systems provide users with informative and concise retrospective summaries of topics as they developed over time. To produce a tweet timeline that summarizes a given topic, a TTG system typically retrieves a ranked list of potentially relevant tweets from which the timeline is then generated. In such a design, the performance of the timeline generation step inevitably depends on that of the retrieval step. In this work, we aim to improve the performance of a given timeline generation system by controlling the depth of the ranked list of retrieved tweets considered when generating the timeline. We propose a supervised approach that predicts the optimal depth of the ranked tweet list for a given topic by combining estimates of list quality computed at different depths. We conducted our experiments on a recent TREC TTG test collection of 243M tweets and 55 topics. We experimented with 14 different retrieval models (used to retrieve the initial ranked list of tweets) and 3 different TTG models (used to generate the final timeline). Our results demonstrate the effectiveness of the proposed approach: it improved TTG performance over a strong baseline in 76% of the cases, out of which 31% were statistically significant, with no significant degradation observed in any case.
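
The idea of predicting a per-topic retrieval depth from list-quality estimates taken at several depths can be illustrated with a small sketch. This is not the authors' model: the abstract does not specify the learner or the features, so the sketch swaps in a plain k-nearest-neighbour regression, and the candidate depths (`DEPTHS`), the feature values, the per-depth timeline scores, and the function names `oracle_depth` and `predict_depth` are all hypothetical and made up for illustration.

```python
from math import dist

# Hypothetical candidate depths at which list quality is estimated.
DEPTHS = [10, 50, 100, 500]

def oracle_depth(ttg_scores):
    """Training label: the depth whose generated timeline scored best."""
    return max(DEPTHS, key=lambda d: ttg_scores[d])

def predict_depth(train, features, k=2):
    """k-NN regression (a stand-in learner, not the paper's): average the
    oracle depths of the k training topics whose quality-estimate vectors
    are closest (Euclidean distance) to `features`."""
    nearest = sorted(train, key=lambda ex: dist(ex[0], features))[:k]
    return sum(depth for _, depth in nearest) / len(nearest)

# Made-up training topics: (quality estimates at each depth in DEPTHS,
# timeline score obtained when cutting the ranked list at each depth).
train_raw = [
    ([0.9, 0.8, 0.4, 0.1], {10: 0.2, 50: 0.6, 100: 0.4, 500: 0.1}),
    ([0.9, 0.7, 0.5, 0.2], {10: 0.3, 50: 0.5, 100: 0.4, 500: 0.2}),
    ([0.8, 0.8, 0.8, 0.7], {10: 0.1, 50: 0.3, 100: 0.5, 500: 0.7}),
]
train = [(feats, oracle_depth(scores)) for feats, scores in train_raw]

# A new topic whose estimated list quality drops off quickly with depth.
predicted = predict_depth(train, [0.9, 0.75, 0.45, 0.15], k=2)
print(predicted)
```

In this toy setup the new topic resembles the two training topics whose best timelines came from a shallow list, so the predicted cutoff is shallow as well; a topic whose estimated quality stays high at depth would be pulled toward a deeper cutoff.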

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 135-146
Number of pages: 12
Volume: 9460
ISBN (Print): 9783319289397
DOIs: https://doi.org/10.1007/978-3-319-28940-3_11
Publication status: Published - 2015
Event: 11th Asia Information Retrieval Societies Conference, AIRS 2015 - Brisbane, Australia
Duration: 2 Dec 2015 to 4 Dec 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9460
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 11th Asia Information Retrieval Societies Conference, AIRS 2015
Country: Australia
City: Brisbane
Period: 2/12/15 to 4/12/15

Keywords

  • Dynamic retrieval cutoff
  • Microblogs
  • Query difficulty
  • Query performance prediction
  • Regression
  • Tweet summarization

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Hasanain, M., Elsayed, T., & Magdy, W. (2015). Improving tweet timeline generation by predicting optimal retrieval depth. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9460, pp. 135-146). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9460). Springer Verlag. https://doi.org/10.1007/978-3-319-28940-3_11