Aggregate estimation over a microblog platform

Saravanan Thirumuruganathan, Nan Zhang, Vagelis Hristidis, Gautam Das

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

Microblogging platforms such as Twitter have experienced phenomenal growth in popularity in recent years, making them attractive platforms for research in diverse fields from computer science to sociology. However, most microblogging platforms impose strict access restrictions (e.g., API rate limits) that prevent scientists with limited resources (e.g., those who cannot afford the microblog-data-access subscriptions offered by GNIP et al.) from leveraging the wealth of microblogs for analytics. For example, Twitter allows only 180 queries per 15 minutes, and its search API returns only tweets posted within the last week. In this paper, we consider the novel problem of estimating aggregate queries over microblogs, e.g., "how many users mentioned the word 'privacy' in 2013?". We propose novel solutions exploiting the user-timeline information that is publicly available on most microblogging platforms. Theoretical analysis and extensive real-world experiments over Twitter, Google+ and Tumblr confirm the effectiveness of our proposed techniques.
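To make the setting in the abstract concrete, the following is a minimal sketch and not the authors' method. It converts the quoted rate limit (180 queries per 15 minutes) into a daily query budget and scales a sample of user timelines up to a population-level COUNT. The keyword list suggests the paper itself relies on random walks. This sketch instead assumes, purely for illustration, that timelines can be sampled uniformly at random and that the total number of users is known. The names queries_per_day, estimate_count and mentions are hypothetical.

```python
import random

# Rate limit quoted in the abstract: 180 queries per 15-minute window.
QUERIES_PER_WINDOW = 180
WINDOW_MINUTES = 15


def queries_per_day(per_window=QUERIES_PER_WINDOW, window_minutes=WINDOW_MINUTES):
    """Upper bound on queries in 24 hours under a fixed-window rate limit."""
    windows_per_day = (24 * 60) // window_minutes
    return per_window * windows_per_day


def mentions(word):
    """Build a predicate that is true if any post in a timeline contains `word`."""
    word = word.lower()
    return lambda timeline: any(word in post.lower() for post in timeline)


def estimate_count(population_size, sampled_timelines, predicate):
    """Estimate how many users satisfy `predicate`.

    ASSUMPTION (not from the paper): `sampled_timelines` is a uniform random
    sample of users and `population_size` is known. Under that assumption,
    scaling the sample fraction by the population size is an unbiased estimate.
    """
    if not sampled_timelines:
        raise ValueError("need at least one sampled timeline")
    hits = sum(1 for timeline in sampled_timelines if predicate(timeline))
    return population_size * hits / len(sampled_timelines)


if __name__ == "__main__":
    # Synthetic toy population: every 20th user mentions "privacy".
    population = [
        ["thoughts on privacy"] if i % 20 == 0 else ["good morning"]
        for i in range(10_000)
    ]
    rng = random.Random(0)
    # Spend one rate-limit window: one timeline fetch per query.
    sample = rng.sample(population, QUERIES_PER_WINDOW)
    print("daily query budget:", queries_per_day())
    print("estimated users:", estimate_count(len(population), sample, mentions("privacy")))
```

With one timeline fetched per query, a single 15-minute window yields a sample of at most 180 users, which is why an estimator that makes good use of each query matters in this setting.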

Original language: English
Title of host publication: SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 1519-1530
Number of pages: 12
ISBN (Print): 9781450323765
DOI: https://doi.org/10.1145/2588555.2610517
Publication status: Published - 1 Jan 2014
Externally published: Yes
Event: 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014 - Snowbird, UT, United States
Duration: 22 Jun 2014 to 27 Jun 2014

Keywords

  • Aggregate estimation
  • Microblogs
  • Random walk
  • Twitter

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Thirumuruganathan, S., Zhang, N., Hristidis, V., & Das, G. (2014). Aggregate estimation over a microblog platform. In SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (pp. 1519-1530). Association for Computing Machinery. https://doi.org/10.1145/2588555.2610517

@inproceedings{ed42b365412a4d399f9d4107d0ef8f8c,
title = "Aggregate estimation over a microblog platform",
abstract = "Microblogging platforms such as Twitter have experienced phenomenal growth in popularity in recent years, making them attractive platforms for research in diverse fields from computer science to sociology. However, most microblogging platforms impose strict access restrictions (e.g., API rate limits) that prevent scientists with limited resources (e.g., those who cannot afford the microblog-data-access subscriptions offered by GNIP et al.) from leveraging the wealth of microblogs for analytics. For example, Twitter allows only 180 queries per 15 minutes, and its search API returns only tweets posted within the last week. In this paper, we consider the novel problem of estimating aggregate queries over microblogs, e.g., {"}how many users mentioned the word 'privacy' in 2013?{"}. We propose novel solutions exploiting the user-timeline information that is publicly available on most microblogging platforms. Theoretical analysis and extensive real-world experiments over Twitter, Google+ and Tumblr confirm the effectiveness of our proposed techniques.",
keywords = "Aggregate estimation, Microblogs, Random walk, Twitter",
author = "Saravanan Thirumuruganathan and Nan Zhang and Vagelis Hristidis and Gautam Das",
year = "2014",
month = "1",
day = "1",
doi = "10.1145/2588555.2610517",
language = "English",
isbn = "9781450323765",
pages = "1519--1530",
booktitle = "SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data",
publisher = "Association for Computing Machinery",

}

TY - GEN

T1 - Aggregate estimation over a microblog platform

AU - Thirumuruganathan, Saravanan

AU - Zhang, Nan

AU - Hristidis, Vagelis

AU - Das, Gautam

PY - 2014/1/1

Y1 - 2014/1/1

AB - Microblogging platforms such as Twitter have experienced phenomenal growth in popularity in recent years, making them attractive platforms for research in diverse fields from computer science to sociology. However, most microblogging platforms impose strict access restrictions (e.g., API rate limits) that prevent scientists with limited resources (e.g., those who cannot afford the microblog-data-access subscriptions offered by GNIP et al.) from leveraging the wealth of microblogs for analytics. For example, Twitter allows only 180 queries per 15 minutes, and its search API returns only tweets posted within the last week. In this paper, we consider the novel problem of estimating aggregate queries over microblogs, e.g., "how many users mentioned the word 'privacy' in 2013?". We propose novel solutions exploiting the user-timeline information that is publicly available on most microblogging platforms. Theoretical analysis and extensive real-world experiments over Twitter, Google+ and Tumblr confirm the effectiveness of our proposed techniques.

KW - Aggregate estimation

KW - Microblogs

KW - Random walk

KW - Twitter

UR - http://www.scopus.com/inward/record.url?scp=84904339927&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84904339927&partnerID=8YFLogxK

U2 - 10.1145/2588555.2610517

DO - 10.1145/2588555.2610517

M3 - Conference contribution

SN - 9781450323765

SP - 1519

EP - 1530

BT - SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data

PB - Association for Computing Machinery

ER -