ShenTu: Processing multi-trillion edge graphs on millions of cores in seconds

Heng Lin, Xiaowei Zhu, Bowen Yu, Xiongchao Tang, Wei Xue, Wenguang Chen, Lufei Zhang, Torsten Hoefler, Xiaosong Ma, Xin Liu, Weimin Zheng, Jingfang Xu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Graphs are an important abstraction used in many scientific fields. With the magnitude of graph-structured data constantly increasing, effective data analytics requires efficient and scalable graph processing systems. Although HPC systems have long been used for scientific computing, people have only recently started to assess their potential for graph processing, a workload with inherent load imbalance, lack of locality, and access irregularity. We propose ShenTu, the first general-purpose graph processing framework that can efficiently utilize an entire Petascale system to process multi-trillion edge graphs in seconds. ShenTu embodies four key innovations: hardware specialization, supernode routing, on-chip sorting, and degree-aware messaging, which together enable its unprecedented performance and scalability. It can traverse a record-size 70-trillion-edge graph in seconds. Furthermore, ShenTu enables the processing of a spam detection problem on a 12-trillion edge Internet graph, making it possible to identify trustworthy and spam webpages directly at the fine-grained page level.

Original language: English
Title of host publication: Proceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 706-716
Number of pages: 11
ISBN (Electronic): 9781538683842
DOI: https://doi.org/10.1109/SC.2018.00059
Publication status: Published - 11 Mar 2019
Event: 2018 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018 - Dallas, United States
Duration: 11 Nov 2018 to 16 Nov 2018

Publication series

Name: Proceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018

Conference

Conference: 2018 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018
Country: United States
City: Dallas
Period: 11/11/18 to 16/11/18

Keywords

  • Application programming interfaces
  • Big data applications
  • Data analysis
  • Graph theory
  • Supercomputers

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Hardware and Architecture
  • Theoretical Computer Science

Cite this

Lin, H., Zhu, X., Yu, B., Tang, X., Xue, W., Chen, W., Zhang, L., Hoefler, T., Ma, X., Liu, X., Zheng, W., & Xu, J. (2019). ShenTu: Processing multi-trillion edge graphs on millions of cores in seconds. In Proceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018 (pp. 706-716). [8665798] (Proceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/SC.2018.00059