Mesa

Georeplicated, near realtime, scalable data warehousing

Ashish Gupta, Fan Yang, Jason Govig, Adam Kirsch, Kelvin Chan, Kevin Lai, Shuo Wu, Sandeep Govind Dhoot, Abhilash Rajesh Kumar, Ankur Agiwal, Sanjay Bhansali, Mingsheng Hong, Jamie Cameron, Masood Siddiqi, David Jones, Jeff Shute, Andrey Gubarev, Shivakumar Venkataraman, Divyakant Agrawal

Research output: Chapter in Book/Report/Conference proceeding > Chapter

39 Citations (Scopus)

Abstract

Mesa is a highly scalable analytic data warehousing system that stores critical measurement data related to Google's Internet advertising business. Mesa is designed to satisfy a complex and challenging set of user and systems requirements, including near real-time data ingestion and queryability, as well as high availability, reliability, fault tolerance, and scalability for large data and query volumes. Specifically, Mesa handles petabytes of data, processes millions of row updates per second, and serves billions of queries that fetch trillions of rows per day. Mesa is geo-replicated across multiple datacenters and provides consistent and repeatable query answers at low latency, even when an entire datacenter fails. This paper presents the Mesa system and reports the performance and scale that it achieves.

Original language: English
Title of host publication: Proceedings of the VLDB Endowment
Publisher: Association for Computing Machinery
Pages: 1259-1270
Number of pages: 12
Volume: 7
Edition: 12
Publication status: Published - 2014
Externally published: Yes

Fingerprint

Data warehouses
Fault tolerance
Scalability
Marketing
Availability
Internet
Industry

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science(all)

Cite this

Gupta, A., Yang, F., Govig, J., Kirsch, A., Chan, K., Lai, K., ... Agrawal, D. (2014). Mesa: Georeplicated, near realtime, scalable data warehousing. In Proceedings of the VLDB Endowment (12 ed., Vol. 7, pp. 1259-1270). Association for Computing Machinery.

Mesa: Georeplicated, near realtime, scalable data warehousing. / Gupta, Ashish; Yang, Fan; Govig, Jason; Kirsch, Adam; Chan, Kelvin; Lai, Kevin; Wu, Shuo; Dhoot, Sandeep Govind; Kumar, Abhilash Rajesh; Agiwal, Ankur; Bhansali, Sanjay; Hong, Mingsheng; Cameron, Jamie; Siddiqi, Masood; Jones, David; Shute, Jeff; Gubarev, Andrey; Venkataraman, Shivakumar; Agrawal, Divyakant.

Proceedings of the VLDB Endowment. Vol. 7 12. ed. Association for Computing Machinery, 2014. p. 1259-1270.

Gupta, A, Yang, F, Govig, J, Kirsch, A, Chan, K, Lai, K, Wu, S, Dhoot, SG, Kumar, AR, Agiwal, A, Bhansali, S, Hong, M, Cameron, J, Siddiqi, M, Jones, D, Shute, J, Gubarev, A, Venkataraman, S & Agrawal, D 2014, Mesa: Georeplicated, near realtime, scalable data warehousing. in Proceedings of the VLDB Endowment. 12 edn, vol. 7, Association for Computing Machinery, pp. 1259-1270.
Gupta A, Yang F, Govig J, Kirsch A, Chan K, Lai K et al. Mesa: Georeplicated, near realtime, scalable data warehousing. In Proceedings of the VLDB Endowment. 12 ed. Vol. 7. Association for Computing Machinery. 2014. p. 1259-1270
Gupta, Ashish; Yang, Fan; Govig, Jason; Kirsch, Adam; Chan, Kelvin; Lai, Kevin; Wu, Shuo; Dhoot, Sandeep Govind; Kumar, Abhilash Rajesh; Agiwal, Ankur; Bhansali, Sanjay; Hong, Mingsheng; Cameron, Jamie; Siddiqi, Masood; Jones, David; Shute, Jeff; Gubarev, Andrey; Venkataraman, Shivakumar; Agrawal, Divyakant. / Mesa: Georeplicated, near realtime, scalable data warehousing. Proceedings of the VLDB Endowment. Vol. 7 12. ed. Association for Computing Machinery, 2014. pp. 1259-1270
@inbook{b769403449c847b48de905aad5710cf9,
title = "Mesa: Georeplicated, near realtime, scalable data warehousing",
abstract = "Mesa is a highly scalable analytic data warehousing system that stores critical measurement data related to Google's Internet advertising business. Mesa is designed to satisfy a complex and challenging set of user and systems requirements, including near real-time data ingestion and queryability, as well as high availability, reliability, fault tolerance, and scalability for large data and query volumes. Specifically, Mesa handles petabytes of data, processes millions of row updates per second, and serves billions of queries that fetch trillions of rows per day. Mesa is geo-replicated across multiple datacenters and provides consistent and repeatable query answers at low latency, even when an entire datacenter fails. This paper presents the Mesa system and reports the performance and scale that it achieves.",
author = "Ashish Gupta and Fan Yang and Jason Govig and Adam Kirsch and Kelvin Chan and Kevin Lai and Shuo Wu and Dhoot, {Sandeep Govind} and Kumar, {Abhilash Rajesh} and Ankur Agiwal and Sanjay Bhansali and Mingsheng Hong and Jamie Cameron and Masood Siddiqi and David Jones and Jeff Shute and Andrey Gubarev and Shivakumar Venkataraman and Divyakant Agrawal",
year = "2014",
language = "English",
volume = "7",
pages = "1259--1270",
booktitle = "Proceedings of the VLDB Endowment",
publisher = "Association for Computing Machinery",
edition = "12",

}

TY - CHAP

T1 - Mesa

T2 - Georeplicated, near realtime, scalable data warehousing

AU - Gupta, Ashish

AU - Yang, Fan

AU - Govig, Jason

AU - Kirsch, Adam

AU - Chan, Kelvin

AU - Lai, Kevin

AU - Wu, Shuo

AU - Dhoot, Sandeep Govind

AU - Kumar, Abhilash Rajesh

AU - Agiwal, Ankur

AU - Bhansali, Sanjay

AU - Hong, Mingsheng

AU - Cameron, Jamie

AU - Siddiqi, Masood

AU - Jones, David

AU - Shute, Jeff

AU - Gubarev, Andrey

AU - Venkataraman, Shivakumar

AU - Agrawal, Divyakant

PY - 2014

Y1 - 2014

N2 - Mesa is a highly scalable analytic data warehousing system that stores critical measurement data related to Google's Internet advertising business. Mesa is designed to satisfy a complex and challenging set of user and systems requirements, including near real-time data ingestion and queryability, as well as high availability, reliability, fault tolerance, and scalability for large data and query volumes. Specifically, Mesa handles petabytes of data, processes millions of row updates per second, and serves billions of queries that fetch trillions of rows per day. Mesa is geo-replicated across multiple datacenters and provides consistent and repeatable query answers at low latency, even when an entire datacenter fails. This paper presents the Mesa system and reports the performance and scale that it achieves.

UR - http://www.scopus.com/inward/record.url?scp=84905096068&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84905096068&partnerID=8YFLogxK

M3 - Chapter

VL - 7

SP - 1259

EP - 1270

BT - Proceedings of the VLDB Endowment

PB - Association for Computing Machinery

ER -